An Instruction Sequence Semigroup with 
Involutive Anti-Automorphisms 


J.A. BERGSTRA and A. PONSE 


Abstract 


We introduce an algebra of instruction sequences by presenting a 
semigroup C in which programs can be represented without directional 
bias: in terms of the next instruction to be executed, C has both 
forward and backward instructions and a C-expression can be interpreted 
starting from any instruction. We provide equations for thread 
extraction, i.e., C's program semantics. Then we consider thread extraction 
compatible (anti-)homomorphisms and (anti-)automorphisms. Finally 
we discuss some expressiveness results. 


1 Introduction 


In this paper three types of mathematical objects play a basic role: 


1. Pieces of code, i.e., finite sequences of instructions, given some set 
$\mathcal{I}$ of instructions. A (computer) program is in our case a piece 
of code that satisfies the additional property that each state of its 
execution is prescribed by an instruction (typically, there are no jumps 
outside the range of instructions). 


2. Finite and infinite sequences of primitive instructions (briefly, SPIs), 
the mathematical objects denoted by pieces of code (in particular 
by programs). Primitive instructions are taken from a set $\mathcal{U}$ that 
(possibly after some renaming) is a strict subset of $\mathcal{I}$. The 
execution of a SPI is single-pass: it starts with executing the first 
primitive instruction, and each primitive instruction is dropped after it 
has been executed or jumped over. 


3. Threads, the mathematical objects representing the execution behav- 
ior of programs and used as their program semantics. Threads are 
defined using polarized actions and a certain form of conditional com- 
position. 


While each (computer) program can be considered as representing a 
sequence of instructions, the converse is not true. Omitting a few lines of 
code from a (well-formed) program usually results in an ill-formed program, 
if the remainder can be called a program at all. Before we discuss the in- 
struction sequence semigroup mentioned in the title of this paper we briefly 
consider “threads”, the mathematical objects representing the execution be- 
havior of programs, or, more generally, of instruction sequences. Threads 
as considered here resemble finite state schemes that represent the execu- 
tion of imperative programs in terms of their (control) actions. We take an 
abstract point of view and only consider actions and tests with symbolic 
names (a, b, ...): 


[Figure: a finite state thread with action nodes [a] and [c], test nodes (b) 
and (d), and the termination state S.] 

In this picture, 

[a] models the execution of action a, and its descent leads to the state 
thereafter (and likewise for [c]); 

(b) models the execution of test action b; its left descent models the 
“true-case” and its right one the “false-case” (and likewise for (d)); 

S is the state that models termination. 


Finite state threads such as the one above can be produced in many ways, and 
a primary goal of program algebra (PGA) is to study which primitives and 
program notations serve that purpose well. The first publication on PGA 
is the paper [7]. A basic expressiveness result states that the class of SPIs 
that can be directly represented in PGA (the so-called periodic SPIs) corre- 
sponds with these finite state threads: each PGA-program produces upon 
execution a finite state thread, and conversely, each finite state thread is 
produced by some PGA-program. 
In this paper we introduce a set of instructions that also suits the 
above-mentioned purpose well and that at the same time has nice mathematical 
properties. Together with concatenation (its natural operation) it forms a 
semigroup with involutions that we call C (for “code”). A simple 
involutive anti-automorphism² transforms each C-program into one of which the 
interpretation from right to left produces the same thread as the original 
program. Furthermore we define some homomorphisms and automorphisms 
that preserve the threads produced by C-expressions, thereby exemplifying 
a simple case of systematic program transformation. We generalize this 
approach by defining bijections on finite state threads and describe the 
associated automorphisms and anti-automorphisms on C, which are all 
generated from simple involutions. Finally, we study a few basic 
expressiveness questions about C. 

The paper is structured as follows: In Section 2 we review threads in the 
setting of program algebra. Then, in Section 3 we introduce the semigroup 
C of sequences of instructions that this paper is about. In Section 4 we 
define thread extraction on C, thereby giving semantics to C-expressions: 
each C-expression produces a finite state thread. In Section 5 we define 
‘C-programs’ and show that these are sufficient to produce finite state 
threads. Furthermore, only certain test instructions in C are necessary to 
preserve C's expressive power. 

Section 6 is about a thread extraction preserving homomorphism on C 
and a related anti-homomorphism. Then, in Section 7 we define a natural 
class of bijections on threads and establish a relation with a class of 
automorphisms on C, and in Section 8 we do the same thing with respect to a 
related class of anti-automorphisms on C. 

In Section 9 we further consider C's instructions in the perspective 
of expressiveness and show that imposing a bound on the counters of 
jump instructions yields a loss in expressive power. In Section 10 we use 
Boolean registers to facilitate easy programming of finite state threads, and 
in Section 11 we relate the length of a C-program to the number of states 
of the thread it produces. 

In Section 12 we discuss C as a context in which some fundamental 
questions about programming can be further investigated and come up with 
some conclusions. 

The paper ends with an appendix that contains some background 
information (Sections A, B and C). 


² We refer to [11] as a general reference for algebraic notions. 
2 Basic Thread Algebra 


In this section we review threads as they emerge from the behavioral ab- 
straction from programs. Most of this text is taken from [14]. 


Basic Thread Algebra (BTA) is a form of process algebra which is 
tailored to the description of sequential program behavior. Based on a set 
A of actions, it has the following constants and operators: 


• the termination constant S, 
• the deadlock or inaction constant D, 
• for each $a \in A$, a binary postconditional composition operator $\_ \unlhd a \unrhd \_$. 


We use action prefixing $a \circ P$ as an abbreviation for $P \unlhd a \unrhd P$ 
and take $\circ$ to bind strongest. Furthermore, for $n \geq 1$ we define 
$a^n \circ P$ by $a^1 \circ P = a \circ P$ and $a^{n+1} \circ P = a \circ (a^n \circ P)$. 

The operational intuition is that each action represents a command 
which is to be processed by the execution environment of the thread. The 
processing of a command may involve a change of state of this environment.³ 
At completion of the processing of the command, the environment produces 
a reply value true or false. The thread $P \unlhd a \unrhd Q$ proceeds as P if 
the processing of a yields true, and it proceeds as Q if the processing of a 
yields false. 

Every thread in BTA is finite in the sense that there is a finite upper 
bound to the number of consecutive actions it can perform. The 
approximation operator $\pi : \mathbb{N} \times \mathrm{BTA} \to \mathrm{BTA}$ gives 
the behavior up to a specified depth. It is defined by 


1. $\pi(0, P) = D$, 
2. $\pi(n+1, S) = S$, $\pi(n+1, D) = D$, 
3. $\pi(n+1, P \unlhd a \unrhd Q) = \pi(n, P) \unlhd a \unrhd \pi(n, Q)$, 


for $P, Q \in \mathrm{BTA}$ and $n \in \mathbb{N}$. We further write $\pi_n(P)$ 
instead of $\pi(n, P)$. We find that for every $P \in \mathrm{BTA}$, there exists 
an $n \in \mathbb{N}$ such that 


$$\pi_n(P) = \pi_{n+1}(P) = \cdots = P.$$ 
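
The three clauses of the approximation operator translate almost literally into code. The following is a minimal sketch, assuming an encoding chosen here for illustration only: the constants are the strings "S" and "D", and a tuple (P, a, Q) stands for $P \unlhd a \unrhd Q$. The names `prefix` and `pi` are hypothetical helpers.

```python
# Assumed encoding (not from the paper): "S" and "D" are the constants,
# and the tuple (P, a, Q) stands for the postconditional composition P <| a |> Q.

def prefix(a, p):
    """Action prefixing a o P, an abbreviation for P <| a |> P."""
    return (p, a, p)

def pi(n, p):
    """Approximation pi(n, P): the behavior of P up to depth n."""
    if n == 0:
        return "D"                                   # clause 1
    if p in ("S", "D"):
        return p                                     # clause 2
    left, a, right = p
    return (pi(n - 1, left), a, pi(n - 1, right))    # clause 3

a_S = prefix("a", "S")
# Successive approximations of a o S stabilize once the depth is reached.
approximations = [pi(n, a_S) for n in range(4)]
```

For the finite thread a ∘ S the approximations are D, a ∘ D, a ∘ S, a ∘ S, so the sequence becomes constant from depth 2 onward, as the statement above predicts for every finite thread.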


³ For the definition of threads we completely abstract from the environment. 
In Appendix C we define services which model (part of) the environment, and 
thread-service composition. 
Following the metric theory of [1] in the form developed as the basis 
of the introduction of processes in [5], BTA has a completion 
$\mathrm{BTA}^{\infty}$ which comprises also the infinite threads. Standard 
properties of the completion technique yield that we may take 
$\mathrm{BTA}^{\infty}$ as the cpo consisting of all so-called 
projective sequences:⁴ 


$$\mathrm{BTA}^{\infty} = \{ (P_n)_{n \in \mathbb{N}} \mid \forall n \in \mathbb{N}\ (P_n \in \mathrm{BTA} \ \&\ \pi_n(P_{n+1}) = P_n) \}.$$ 


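For a detailed account of this construction see [3] or [15]. On 
$\mathrm{BTA}^{\infty}$, equality is defined componentwise: 
$(P_n)_{n \in \mathbb{N}} = (Q_n)_{n \in \mathbb{N}}$ if for all $n \in \mathbb{N}$, $P_n = Q_n$. 

Overloading notation, we now define the constants and operators of 
BTA on $\mathrm{BTA}^{\infty}$: 


1. $D = (D, D, \ldots)$ and $S = (D, S, S, \ldots)$; 

2. $(P_n)_{n \in \mathbb{N}} \unlhd a \unrhd (Q_n)_{n \in \mathbb{N}} = (R_n)_{n \in \mathbb{N}}$ with 
   $R_0 = D$, 
   $R_{n+1} = P_n \unlhd a \unrhd Q_n$. 


The elements of BTA are included in $\mathrm{BTA}^{\infty}$ by a mapping 
following this definition. E.g., 


$a \circ S \mapsto (P_n)_{n \in \mathbb{N}}$ with $P_0 = D$, $P_1 = a \circ D$ and for $n \geq 2$, $P_n = a \circ S$. 


It is not difficult to show that the projective sequence of $P \in \mathrm{BTA}$ 
thus defined equals $(\pi_n(P))_{n \in \mathbb{N}}$. We further use this inclusion 
of finite threads in $\mathrm{BTA}^{\infty}$ implicitly and write $P, Q, \ldots$ to 
denote elements of $\mathrm{BTA}^{\infty}$. 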

We define the set Res(P) of residual threads of P inductively as follows: 


1. $P \in Res(P)$, 
2. $Q \unlhd a \unrhd R \in Res(P)$ implies $Q \in Res(P)$ and $R \in Res(P)$. 


A residual thread may be reached (depending on the execution environment) 
by performing zero or more actions. A thread P is regular if Res(P) is finite. 
Regular threads are also called finite state threads. 

A finite linear recursive specification over $\mathrm{BTA}^{\infty}$ is a set 
of equations 


$$x_i = t_i$$ 


for $i \in I$ with $I$ some finite index set, variables $x_i$, and all $t_i$ 
terms of the form S, D, or $x_j \unlhd a \unrhd x_k$ with $j, k \in I$. Finite 
linear recursive specifications represent continuous operators having unique 
fixed points [15]. 


⁴ The cpo is based on the partial ordering $\sqsubseteq$ defined by 
$D \sqsubseteq P$, and $P \sqsubseteq P'$, $Q \sqsubseteq Q'$ 
implies $P \unlhd a \unrhd Q \sqsubseteq P' \unlhd a \unrhd Q'$. 
Theorem 1. For all $P \in \mathrm{BTA}^{\infty}$, P is regular iff P is the 
solution of a finite linear recursive specification. 


Proof. Suppose P is regular. Then Res(P) is finite, so P has residual 
threads $P_1, \ldots, P_n$ with $P = P_1$. We construct a finite linear recursive 
specification with variables $x_1, \ldots, x_n$ as follows: 

$$x_i = \begin{cases}
D & \text{if } P_i = D, \\
S & \text{if } P_i = S, \\
x_j \unlhd a \unrhd x_k & \text{if } P_i = P_j \unlhd a \unrhd P_k.
\end{cases}$$ 


For the converse, assume that P is the solution of some finite linear 
recursive specification E with variables $x_1, \ldots, x_n$. Because the 
variables in E have unique fixed points, we know that there are threads 
$P_1, \ldots, P_n \in \mathrm{BTA}^{\infty}$ with $P = P_1$ and for every 
$i \in \{1, \ldots, n\}$, either $P_i = D$, $P_i = S$, or 
$P_i = P_j \unlhd a \unrhd P_k$ for some $j, k \in \{1, \ldots, n\}$. We find that 
$Q \in Res(P)$ iff $Q = P_i$ for some $i \in \{1, \ldots, n\}$. So Res(P) is 
finite, and P is regular. 


Example 1. The regular threads $a^2 \circ D$ and $a^{\infty} = a \circ a \circ \cdots$ 
are the respective fixed points for $x_1$ in the finite linear recursive 
specifications 


1. $\{x_1 = a \circ x_2,\ x_2 = a \circ x_3,\ x_3 = D\}$, 
2. $\{x_1 = a \circ x_1\}$. 

In reasoning with finite linear recursive specifications, we shall often 
identify variables and their fixed points. For example, we say that P is the 
thread defined by $P = a \circ P$ instead of stating that P equals the fixed 
point for x in the finite linear recursive specification $x = a \circ x$. In 
this paper we write 


$$T_{reg}$$ 


for the set of regular threads in $\mathrm{BTA}^{\infty}$. 

An elegant result based on [2] is that equality of recursively specified 
regular threads can be easily decided. Because one can always take the 
disjoint union of two finite linear recursive specifications it suffices to 
consider a single finite linear recursive specification 
$\{P_i = t_i \mid 1 \leq i \leq n\}$. Then $P_i = P_j$ follows from 
$\pi_{n-1}(P_i) = \pi_{n-1}(P_j)$. Thus, it is sufficient to decide 
whether two certain finite threads are equal. In Appendix B we provide a 
proof sketch. 
3 C, a Semigroup for Code 


In this section we introduce the sequences of instructions that form the 
main subject of this paper. We call these sequences “pieces of code” and 
use the letter C to represent the resulting semigroup. The set A of actions 
represents a parameter for C (as it does for BTA). 


For $a \in A$ and k ranging over $\mathbb{N}^+$ (i.e., $\mathbb{N} \setminus \{0\}$), 
C-expressions are of the following form: 


P ::= /a | +/a | -/a | /#k | \a | +\a | -\a | \#k | ! | # | P;P 


In C the operation “;” is called concatenation and all other syntactical 
categories are called C-instructions: 


/a is a forward basic instruction. It prescribes to perform action a and 
then (irrespective of the Boolean reply) to execute the instruction 
concatenated to its right-hand side; if there is no such instruction, 
deadlock follows. 


+/a and -/a are forward test instructions. The positive forward test 
instruction +/a prescribes to perform action a and upon reply true 
to execute the instruction concatenated to its right-hand side, and 
upon reply false to execute the second instruction concatenated to 
its right-hand side; if there is no such instruction to be executed, 
deadlock follows. For the negative forward test instruction -/a, execution 
of the next instruction is prescribed by the complementary replies. 


/#k is a forward jump instruction. It prescribes to execute the instruc- 
tion that is k positions to the right and deadlock if there is no such 
instruction. 


\a, +\a, -\a and \#k are the backward versions of the instructions 
mentioned above. For these instructions, orientation is from right to left. 
For example, \a prescribes to perform action a and then to execute 
the instruction concatenated to its left-hand side; if there is no such 
instruction, deadlock follows. 


! is the termination instruction and prescribes successful termination. 


# is the abort instruction and prescribes deadlock. 
For C there is one axiom: 

$$(X;Y);Z = X;(Y;Z) \tag{1}$$ 


By this axiom, C is a semigroup and we shall not use brackets in repeated 
concatenations. As an example, 


+/a;!;\#2 


is considered an appropriate C-expression. The instructions for termination 
and deadlock are the only instructions that do not specify further control 
of execution. 

Perhaps the most striking aspect of C is that its sequences of 
instructions have no directional bias. Although most program notations have a 
left to right (and top to bottom) natural order, symmetry arguments clarify 
that an orientation in the other direction might be present as well. 

It is an empirical fact that imperative program notations in the vast 
majority of cases make use of a default direction, inherited from the nat- 
ural language in which a program notation is naturally embedded. This 
embedding is caused by the language designers, or by the language that 
according to the language designers will be the dominant mother tongue 
of envisaged programmers. None of these matters can be considered core 
issues in computer science. 

The fact, however, that imperative programs invariably show a default 
directional bias itself might admit an explanation in terms of complexity 
of design, expression or execution, and C provides a context in which this 
advantage may be investigated. 

Thus, in spite of overwhelming evidence of the presence of directional 
bias in ‘practice’ we propose that the primary notation for sequences of 
instructions to be used for theoretical work is C, which refutes this bias. 
Obviously, from C one may derive a dialect C′ by writing a for /a, +a for 
+/a, -a for -/a and #k for /#k. Now there is a directional bias and in 
terms of bytes, the instructions are shorter. As explained in Section 5, the 
instructions \a, +\a and -\a can be eliminated, thus obtaining a smaller 
instruction set which is more easily parsed. One may also do away with a 
and -a in favor of +a, again reducing the number of instructions. Reduction 
of the number of instructions leads to longer sequences, however, and where 
the optimum of this trade-off is found is a matter which lies outside the 
theory of instruction sequences per se. We further discuss the nature of C 
in Section 12. 
4 Thread Extraction and C-Expressions 


In this section we define thread extraction on C. For a C-expression X, 
$|X|^{\rightarrow}$ denotes the thread produced by X when execution is started 
at the leftmost or “first” instruction, thus $|\_|^{\rightarrow}$ is an operator 
that assigns a thread to a C-expression. We prove that this is always a 
regular thread. We also consider right-to-left thread extraction where thread 
extraction starts at the rightmost instruction of a C-expression. 


We will use auxiliary functions $|X|_j$ with j ranging over the integers 
$\mathbb{Z}$ and we define 


$$|X|^{\rightarrow} = |X|_1,$$ 


meaning that thread extraction starts at the first (or leftmost) instruction 
of X. For $j \in \mathbb{Z}$, $|X|_j$ is defined in Table 1. 


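Let $X = i_1; \ldots; i_n$ and $j \in \mathbb{Z}$. 


For $j \in \{1, \ldots, n\}$, 


$$|X|_j = \begin{cases}
a \circ |X|_{j+1} & \text{if } i_j = /a, \\
|X|_{j+1} \unlhd a \unrhd |X|_{j+2} & \text{if } i_j = +/a, \\
|X|_{j+2} \unlhd a \unrhd |X|_{j+1} & \text{if } i_j = -/a, \\
|X|_{j+k} & \text{if } i_j = /\#k, \\
a \circ |X|_{j-1} & \text{if } i_j = \backslash a, \\
|X|_{j-1} \unlhd a \unrhd |X|_{j-2} & \text{if } i_j = +\backslash a, \\
|X|_{j-2} \unlhd a \unrhd |X|_{j-1} & \text{if } i_j = -\backslash a, \\
|X|_{j-k} & \text{if } i_j = \backslash\#k, \\
S & \text{if } i_j = {!}, \\
D & \text{if } i_j = \#.
\end{cases}$$ 

For $j \notin \{1, \ldots, n\}$, 

$$|X|_j = D. \tag{2}$$ 


Table 1: Equations for thread extraction 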
A special case arises if these equations applied from left to right define 
a loop without any actions, as in 

$$|/\#2; /a; \backslash\#2|_1 = |/\#2; /a; \backslash\#2|_3 = |/\#2; /a; \backslash\#2|_1.$$ 

For this case we have the following rule: 


If the equations in Table 1 applied from left to right yield 
a loop without any actions, the extracted thread is D. (3) 

Rule (3) applies if and only if a loop in a thread extraction is the result of 
consecutive jumps to jump instructions. 

In the following we show that thread extraction on C-expressions 
produces regular threads. For a C-expression X we define 
$\ell(X) \in \mathbb{N}^+$ to be the length of X, i.e., its number of instructions. 


Theorem 2. If X is a C-expression and $i \in \mathbb{Z}$, then $|X|_i$ defines 
a regular thread. 


Proof. Assume X is a C-expression with $\ell(X) = n$. If $i \notin \{1, \ldots, n\}$, 
then $|X|_i = D$ by rule (2). In the other case, a single application of the 
matching equation in Table 1 determines for each $i \in \{1, \ldots, n\}$ an 
equation of the form 


$$|X|_i = |X|_j \unlhd a \unrhd |X|_k, \ \text{or}\ |X|_i = |X|_j, \ \text{or}\ |X|_i = D, \ \text{or}\ |X|_i = S \tag{4}$$ 


where by rule (2) we may assume that all expressions $|X|_j$ and $|X|_k$ 
occurring in the right-hand sides satisfy $j, k \in \{1, \ldots, n\}$ (otherwise 
they are replaced by D). We construct n linear equations $x_i = t_i$ with the 
property that $|X|_i$ as given by the rules for thread extraction is a fixed 
point for $x_i$: 


1. Define $x_i = t_i$ from (4) by replacing each $|X|_j$ by $x_j$. 


2. Determine with Rule (3) all equations $|X|_i = |X|_j$ that define a loop 
without actions, and replace all associated equations $x_i = x_j$ by 
$x_i = D$. 

3. Replace any remaining equation of the form $x_i = x_j$ by $x_i = t_j$, 
where $t_j$ is the right-hand side of the equation for $x_j$. Repeating 
this procedure exhaustively yields a finite linear specification with 
variables $x_1, \ldots, x_n$. 


For each $i \in \{1, \ldots, n\}$ the thread defined by thread extraction on 
$|X|_i$ is a fixed point for $x_i$. Hence $|X|^{\rightarrow}$ is a regular 
thread, and so is $|X|^{\leftarrow}$. 


Given some C-expression X, we shall often use $|X|_j$ as the identifier of 
the thread defined by $|X|_j$ as meant in Theorem 2, and similarly for 
$|X|^{\rightarrow}$. 
As an example of thread extraction, consider the C-expression 


X = /a;+/b;\c;+/d; !;\#5 (5) 


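It is not hard to check that X produces the regular thread $P_1$ (i.e., 
$|X|^{\rightarrow} = P_1$) defined by⁵ 


$$\begin{aligned}
P_1 &= a \circ P_2 \\
P_2 &= P_3 \unlhd b \unrhd P_4 \\
P_3 &= c \circ P_2 \\
P_4 &= P_5 \unlhd d \unrhd P_1 \\
P_5 &= S
\end{aligned}$$ 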

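Thread extraction defines an equivalence on C-expressions, say 
$X \equiv_{\rightarrow} Y$ if $|X|^{\rightarrow} = |Y|^{\rightarrow}$, that is not a 
congruence, e.g., 


$$\# \equiv_{\rightarrow} /\#1 \quad\text{but}\quad \#; /a \not\equiv_{\rightarrow} /\#1; /a.$$ 


We define right-to-left thread extraction, notation 
$|X|^{\leftarrow}$, 
as the thread extraction that starts from the rightmost position of a piece 
of code: 


$$|X|^{\leftarrow} = |X|_{\ell(X)}$$ 


where $\ell(X) \in \mathbb{N}^+$ is the length of X, i.e., its number of 
instructions. Taking X as defined in Example (5), we find 
$|X|^{\leftarrow} = |X|^{\rightarrow}$ because for that particular X, 
$|X|_6 = |X|_1$. Right-to-left thread extraction also defines an 
equivalence on C-expressions, say $X \equiv_{\leftarrow} Y$ if 
$|X|^{\leftarrow} = |Y|^{\leftarrow}$, that is not a congruence, e.g., 


$$\# \equiv_{\leftarrow} \backslash\#1 \quad\text{but}\quad /a; \# \not\equiv_{\leftarrow} /a; \backslash\#1.$$ 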


⁵ This regular thread $P_1$ can be visualized as was done in Section 1. 
5 Expressiveness of C-Programs 


In this section we introduce the notion of a ‘C-program’. Furthermore we 
discuss a basic expressiveness result: we show that each regular thread is 
the thread extraction of some C-program. Finally we establish that we do 
not need all of C’s instructions to preserve expressiveness. 


Definition 1. A C-program is a piece of code $X = i_1; \ldots; i_n$ with $n > 0$ 
such that the computation of $|X|_j$ for each $j = 1, \ldots, n$ does not use 
equation (2). In other words, there are no jumps outside the range of X and 
execution can only end by executing either the termination instruction ! or 
the abort instruction #. 
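
Definition 1 amounts to a simple range check on the positions each instruction can transfer control to. A minimal sketch, assuming the same textual encoding as before (";" as separator, ASCII "-" for negative tests, no "#", "/" or "\" inside action names); `successors` and `is_c_program` are hypothetical names:

```python
def successors(ins, j):
    """Positions that instruction ins at position j can pass control to."""
    if ins in ("!", "#"):
        return []
    sign = ins[0] if ins[0] in "+-" else ""
    core = ins[len(sign):]
    d = 1 if core[0] == "/" else -1          # forward or backward
    if core[1] == "#":                       # jump instruction
        return [j + d * int(core[2:])]
    if sign:                                 # test: next and second-next
        return [j + d, j + 2 * d]
    return [j + d]                           # basic instruction

def is_c_program(code):
    """True if no instruction can transfer control outside the range."""
    instrs = [tok.strip() for tok in code.split(";")]
    n = len(instrs)
    return all(1 <= t <= n
               for j, ins in enumerate(instrs, start=1)
               for t in successors(ins, j))

fragment = r"+/a;\#2"
padded = r"#;+/a;\#2;#"
```

The fragment `+/a;\#2` fails the check because both instructions can leave the range, whereas padding it with abort instructions on both sides yields a piece of code that passes.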


In the setting of program algebra we explicitly distinguished in [9] a 
“program” from an instruction sequence (or a piece of code) in the sense 
that a program has a natural and preferred semantics, while this is not the 
case for the latter one. Observe that if X and Y are C-programs, then so 
is X;Y. A piece of code that is not a program can be called a program 
fragment because it can be extended to a program that yields the same 
thread extraction. This follows from the next proposition, which states 
that position numbers can be relativized. 


Proposition 1. For $k \in \mathbb{N}$ and X a C-expression, 
1. $|X|_k = |\#; X|_{k+1}$, 
2. $|X|_k = |X; \#|_k$. 

Moreover, in the case that X is a C-program and $1 \leq k \leq \ell(X)$, 
3. $|X|_k = |/\#k; X|^{\rightarrow}$, 
4. $|X|_k = |X; \backslash\#\ell(X){+}1{-}k|^{\leftarrow}$. 

With properties 1 and 2 we find for example 

$$|{+/a}; \backslash\#2|^{\rightarrow} = |{+/a}; \backslash\#2|_1 = |\#; {+/a}; \backslash\#2; \#|_2,$$ 

and since the latter piece of code is a C-program, we find with property 3 
another one that produces the same thread with left-to-right thread 
extraction: 


$$|\#; {+/a}; \backslash\#2; \#|_2 = |/\#2; \#; {+/a}; \backslash\#2; \#|^{\rightarrow}.$$ 
Of course, for property 3 to be valid it is crucial that X is a C-program: for 
example 


$$|{+/a}; \backslash\#2|^{\rightarrow} = |{+/a}; \backslash\#2|_1 \neq |/\#1; {+/a}; \backslash\#2|^{\rightarrow}.$$ 


A similar example contradicting property 4 for X not a C-program is easily 
found. 


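Theorem 3. Each regular thread in $T_{reg}$ is produced by a C-program. 


Proof. Assume that a regular thread $P_1$ is specified by linear equations 
$P_1 = t_1, \ldots, P_n = t_n$. We transform each equation into a piece of C-code: 

$$\begin{array}{lcl}
P_i = S & \mapsto & {!}; \#; \#, \\
P_i = D & \mapsto & \#; \#; \#, \\
P_i = P_j \unlhd a \unrhd P_k & \mapsto &
\begin{cases}
+/a;\ /\#p;\ /\#q & \text{if } p, q > 0, \\
+/a;\ /\#p;\ \backslash\#(-q) & \text{if } p > 0,\ q < 0, \\
+/a;\ \backslash\#(-p);\ /\#q & \text{if } p < 0,\ q > 0, \\
+/a;\ \backslash\#(-p);\ \backslash\#(-q) & \text{if } p, q < 0,
\end{cases}
\end{array}$$ 


where $p = 3(j - i) - 1$ and $q = 3(k - i) - 2$ (so $p, q \in \mathbb{Z} \setminus \{0\}$). 
Concatenating these pieces of code in the order given by $P_1, \ldots, P_n$ 
yields a C-expression X with $|X|^{\rightarrow} = P_1$. By construction X 
contains no jumps outside the range of instructions and therefore X is a 
C-program. Finally, note that the instructions of X are in the set 
$\{+/a,\ /\#k,\ \backslash\#k,\ {!},\ \# \mid a \in A,\ k \in \mathbb{N}^+\}$. 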
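
The translation used in this proof is easy to mechanize. The sketch below assumes a list encoding of a linear specification chosen for illustration: entry i (1-based) is "S", "D", or a triple (j, a, k) standing for P_i = P_j ⊴a⊵ P_k. The names `jump` and `compile_spec` are hypothetical.

```python
def jump(offset):
    """A forward jump for a positive offset, a backward jump otherwise."""
    return f"/#{offset}" if offset > 0 else f"\\#{-offset}"

def compile_spec(spec):
    """Translate equations P_1 = t_1, ..., P_n = t_n into C-code,
    three instructions per equation, following the scheme of the proof."""
    out = []
    for i, t in enumerate(spec, start=1):
        if t == "S":
            out += ["!", "#", "#"]
        elif t == "D":
            out += ["#", "#", "#"]
        else:
            j, a, k = t
            p = 3 * (j - i) - 1      # from the second slot to the code of P_j
            q = 3 * (k - i) - 2      # from the third slot to the code of P_k
            out += [f"+/{a}", jump(p), jump(q)]
    return ";".join(out)

# The thread P_1 of expression (5), with a o P written as P <| a |> P.
spec = [(2, "a", 2), (3, "b", 4), (2, "c", 2), (5, "d", 1), "S"]
program = compile_spec(spec)
```

The equation for P_i occupies positions 3i-2, 3i-1 and 3i, which is why the offsets p and q are never zero. For the specification above, the last test instruction sits at position 10 and its false-branch slot at position 12 jumps back 11 positions to the start.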


From the proof of Theorem 3 we infer that only positive forward test 
instructions, jumps and termination are needed to preserve C's 
expressiveness: 


Corollary 1. Let $C^{-}$ be defined by allowing only instructions from the set 
$\{+/a,\ /\#k,\ \backslash\#k,\ {!} \mid a \in A,\ k \in \mathbb{N}^+\}$. 
Then each regular thread in $T_{reg}$ can be produced by a program in $C^{-}$. 


Proof. With # added to the instruction set mentioned, the result follows 
immediately from the proof of Theorem 3. The use of # in that proof can 
easily be avoided, for example by setting 


$$\begin{array}{lcll}
P_i = S & \mapsto & {!};\ /\#1;\ \backslash\#1 & \text{(instead of } {!}; \#; \#\text{)}, \\
P_i = D & \mapsto & /\#1;\ /\#1;\ \backslash\#1 & \text{(instead of } \#; \#; \#\text{)}.
\end{array}$$ 


The resulting expression clearly contains no jumps outside its range and is 
hence a C-program. 


6 Thread Extraction Preserving Homomorphisms 


In this section we consider functions on C that preserve thread extraction. 
We start with a homomorphism that turns all basic and test instructions 
into their forward counterparts, and another one that only yields positive 
forward test instructions. Then we consider an anti-homomorphism that 
relates extraction with right-to-left thread extraction. So, these functions 
are very basic examples of program transformation. 


Let the function $h : C \to C$ be defined on C-instructions as follows: 


$$\begin{array}{rcl}
/a & \mapsto & /a;\ /\#2;\ \#, \\
+/a & \mapsto & +/a;\ /\#2;\ /\#4, \\
-/a & \mapsto & -/a;\ /\#2;\ /\#4, \\
/\#k & \mapsto & /\#3k;\ \#;\ \#, \\
\backslash a & \mapsto & /a;\ \backslash\#4;\ \#, \\
+\backslash a & \mapsto & +/a;\ \backslash\#4;\ \backslash\#8, \\
-\backslash a & \mapsto & -/a;\ \backslash\#4;\ \backslash\#8, \\
\backslash\#k & \mapsto & \backslash\#3k;\ \#;\ \#, \\
{!} & \mapsto & {!};\ \#;\ \#, \\
\# & \mapsto & \#;\ \#;\ \#.
\end{array}$$ 


So, h replaces all basic and test instructions by fragments containing only 
their forward counterparts. Defining 


$$h(X;Y) = h(X); h(Y)$$ 


makes h an injective homomorphism (a ‘monomorphism’) that preserves the 
equivalence obtained by (left-to-right) thread extraction, i.e., 


$$|X|^{\rightarrow} = |h(X)|^{\rightarrow}.$$ 

This follows from the more general property 


$$|X|_{j+1} = |h(X)|_{3j+1}$$ 

for all $j < \ell(X)$, which is easy to prove by case distinction. So, 
$|X|^{\rightarrow} = |h^{*}(X)|^{\rightarrow}$, and, moreover, if X is a C-program, 
then so is $h^{*}(X)$. 
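
The property relating positions j+1 in X and 3j+1 in h(X) can be tested on small examples. The following sketch assumes instruction lists of strings with ASCII "-" for negative tests; `project` computes a finite approximation of the extracted thread as nested tuples (P, a, Q), and both `project` and `h` are hypothetical helper names. Comparing approximations to a fixed depth is only a sanity check, not a proof.

```python
def project(instrs, j, depth):
    """Approximation to the given depth of the thread extracted at
    position j; a loop through jumps only gives "D" (rule (3))."""
    seen = set()
    while True:
        if depth == 0 or not 1 <= j <= len(instrs) or j in seen:
            return "D"
        ins = instrs[j - 1]
        if ins == "!":
            return "S"
        if ins == "#":
            return "D"
        sign = ins[0] if ins[0] in "+-" else ""
        core = ins[len(sign):]
        d = 1 if core[0] == "/" else -1
        if core[1] == "#":                      # jump: no action performed
            seen.add(j)
            j += d * int(core[2:])
            continue
        if sign == "+":
            t, f = j + d, j + 2 * d
        elif sign == "-":
            t, f = j + 2 * d, j + d
        else:
            t, f = j + d, j + d
        return (project(instrs, t, depth - 1), core[1:],
                project(instrs, f, depth - 1))

def h(instrs):
    """Map every instruction to three instructions, using only forward
    basic and test instructions, as in the table defining h."""
    out = []
    for ins in instrs:
        if ins in ("!", "#"):
            out += [ins, "#", "#"]
            continue
        sign = ins[0] if ins[0] in "+-" else ""
        core = ins[len(sign):]
        back = core[0] == "\\"
        if core[1] == "#":
            out += [f"{core[0]}#{3 * int(core[2:])}", "#", "#"]
        elif back:
            out += [f"{sign}/{core[1:]}", "\\#4", "\\#8" if sign else "#"]
        else:
            out += [f"{sign}/{core[1:]}", "/#2", "/#4" if sign else "#"]
    return out

X = r"/a;+/b;\c;+/d;!;\#5".split(";")
HX = h(X)
agree = all(project(X, j + 1, 6) == project(HX, 3 * j + 1, 6)
            for j in range(len(X)))
```

On expression (5), `agree` holds for every position, and the image contains no backward basic or test instructions.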

Of course many variants of the homomorphism h satisfy the latter two 
properties. A particular one is the homomorphism obtained from h by 
replacement with the following defining clauses: 


$$\begin{array}{rcl}
/a & \mapsto & +/a;\ /\#2;\ /\#1, \\
-/a & \mapsto & +/a;\ /\#5;\ /\#1, \\
\backslash a & \mapsto & +/a;\ \backslash\#4;\ \backslash\#5, \\
-\backslash a & \mapsto & +/a;\ \backslash\#7;\ \backslash\#5,
\end{array}$$ 

because now only forward positive test instructions occur in the 
homomorphic image. In other words: with respect to thread extraction, C's 
expressive power is preserved if its set of instructions is reduced to 


$$\{+/a,\ /\#k,\ \backslash\#k,\ {!},\ \# \mid a \in A,\ k \in \mathbb{N}^+\}.$$ 


This is the syntactic counterpart of Corollary 1 in Section 5. 
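Let $g : C \to C$ be defined on C-instructions as follows: 


$$\begin{array}{rcl}
/a & \mapsto & \#;\ \backslash\#2;\ \backslash a, \\
+/a & \mapsto & \backslash\#4;\ \backslash\#2;\ +\backslash a, \\
-/a & \mapsto & \backslash\#4;\ \backslash\#2;\ -\backslash a, \\
/\#k & \mapsto & \#;\ \#;\ \backslash\#3k, \\
\backslash a & \mapsto & \#;\ /\#4;\ \backslash a, \\
+\backslash a & \mapsto & /\#8;\ /\#4;\ +\backslash a, \\
-\backslash a & \mapsto & /\#8;\ /\#4;\ -\backslash a, \\
\backslash\#k & \mapsto & \#;\ \#;\ /\#3k, \\
{!} & \mapsto & \#;\ \#;\ {!}, \\
\# & \mapsto & \#;\ \#;\ \#.
\end{array}$$ 

So, g replaces all basic and test instructions by C-fragments containing only 
their backward counterparts. Defining $g(X;Y) = g(Y); g(X)$ makes g an 
anti-homomorphism that satisfies 


$$|X|^{\rightarrow} = |g(X)|^{\leftarrow}.$$ 


This follows from a more general property discussed in Section 8. 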
7 Structural Bijections and TEC-Automorphisms 


In this section we define structural bijections on the finite state threads over 
A as a natural type of (bijective) thread transformations. We then describe 
and analyze the associated class of automorphisms on C, which appear to 
be generated from simple involutions. 


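Given a bijection $\phi$ on A (thus a permutation of A) and a partitioning 
of A in $A_{true}$ and $A_{false}$, we extend $\phi$ to a structural bijection 
on BTA by defining for all $a \in A$ and $P, Q \in \mathrm{BTA}$, 


$$\begin{aligned}
\phi(D) &= D, \\
\phi(S) &= S, \\
\phi(P \unlhd a \unrhd Q) &= \begin{cases}
\phi(P) \unlhd \phi(a) \unrhd \phi(Q) & \text{if } \phi(a) \in A_{true}, \\
\phi(Q) \unlhd \phi(a) \unrhd \phi(P) & \text{if } \phi(a) \in A_{false}.
\end{cases}
\end{aligned}$$ 


Structural bijections naturally extend to $T_{reg}$: if $P_i$ is a fixed point 
for $x_i$ in the finite linear specification 
$\{x_i = t_i(\bar{x}) \mid i = 1, \ldots, n\}$, then $\phi(P_i)$ is 
a fixed point for $y_i$ in 


$$\{y_i = \phi(t_i(\bar{x})) \mid i = 1, \ldots, n,\ \phi(x_i) = y_i\}. \tag{6}$$ 

As an example, assume that $\phi(a) = b \in A_{false}$ and thread P is given by 

$$P = P \unlhd a \unrhd Q, \quad Q = D,$$ 

then $P' = \phi(P)$ is defined by 

$$P' = Q' \unlhd b \unrhd P', \quad Q' = D.$$ 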

Theorem 4. There are $2^{|A|} \cdot |A|!$ structural bijections on BTA, and thus 
on $T_{reg}$. 


Proof. Trivial: if $|A| = n$, there are $2^n$ different partitionings in 
$A_{true}$ and $A_{false}$, and $n!$ different bijections on A. 


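Each structural bijection can be written as the composition of a (possibly 
empty) series of transpositions or ‘swaps’ (its permutation part) and a 
(possibly empty) series of postconditional ‘flips’ that model the false-part 
of its partitioning. So, for a fixed $\phi$ there exist k and m such that 


$$\phi = \mathit{flip}_{c_1} \circ \cdots \circ \mathit{flip}_{c_m} \circ \mathit{swap}_{a_1,b_1} \circ \cdots \circ \mathit{swap}_{a_k,b_k}$$ 

where $\mathit{swap}_{a,b}$ models the exchange of actions a and b, and 
$\mathit{flip}_c$ the postconditional flips for $A_{false} = \{c_1, \ldots, c_m\}$, 
and $\phi$ is the identity if $k = m = 0$. More precisely, 


$$\mathit{swap}_{a,b}(P \unlhd c \unrhd Q) = \mathit{swap}_{a,b}(P) \unlhd \bar{c} \unrhd \mathit{swap}_{a,b}(Q)
\quad\text{with}\quad
\bar{c} = \begin{cases}
b & \text{if } c = a, \\
a & \text{if } c = b, \\
c & \text{otherwise,}
\end{cases}$$ 

and 

$$\mathit{flip}_c(P \unlhd a \unrhd Q) = \begin{cases}
\mathit{flip}_c(Q) \unlhd a \unrhd \mathit{flip}_c(P) & \text{if } a = c, \\
\mathit{flip}_c(P) \unlhd a \unrhd \mathit{flip}_c(Q) & \text{otherwise.}
\end{cases}$$ 

For $A = \{a_1, \ldots, a_n\}$ we can do with $n - 1$ swaps 
$\mathit{swap}_{a_1,a_j}$ ($1 < j \leq n$) as these define any other swap by 
$\mathit{swap}_{a_i,a_j} = \mathit{swap}_{a_1,a_i} \circ \mathit{swap}_{a_1,a_j} \circ \mathit{swap}_{a_1,a_i}$, 
and n flips $\mathit{flip}_{a_i}$ ($1 \leq i \leq n$). 

We show that structural bijections naturally correspond with a certain 
class of automorphisms on C. 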


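Definition 2. An automorphism $\alpha$ on C is thread extraction compatible 
(TEC) if there exists a structural bijection $\beta$ such that the following 
diagram commutes: 


$$\begin{array}{ccc}
C & \xrightarrow{\ |\_|^{\rightarrow}\ } & T_{reg} \\
\downarrow \alpha & & \downarrow \beta \\
C & \xrightarrow{\ |\_|^{\rightarrow}\ } & T_{reg}
\end{array}$$ 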


Theorem 5. The TEC-automorphisms on C are generated by 


$\mathit{swap}_{a,b}$: exchanges a and b in all instructions containing a or b, 


$\mathit{flip}_a$: exchanges + and - in all test instructions containing a, 

where a and b range over A. 


Proof. First we have to show that if $\alpha$ is generated from 
$\mathit{swap}_{a,b}$ and $\mathit{flip}_a$ ($a, b \in A$), then $\alpha$ is a 
TEC-automorphism. This follows from the fact that the diagram in Definition 2 
commutes for $\mathit{swap}_{a,b}$ if we take $\beta = \mathit{swap}_{a,b}$ 
and for $\mathit{flip}_a$ if we take $\beta = \mathit{flip}_a$. We show this below. 

Then we have to show that if $\alpha$ is a TEC-automorphism, then $\alpha$ is 
generated from swaps and flips. Above we argued that each structural 
bijection can be characterized by zero or more $\mathit{swap}_{a,b}$ and 
$\mathit{flip}_c$ applications. So, again it suffices to argue that for 
$\beta = \mathit{swap}_{a,b}$ the diagram commutes if $\alpha = \mathit{swap}_{a,b}$, 
and for $\beta = \mathit{flip}_c$ if $\alpha = \mathit{flip}_c$. The general case 
follows from repeated applications. 

Let $X \in C$. First assume $\beta = \mathit{flip}_c$. Following the construction 
in the proof of Theorem 2 we find a finite linear specification 
$\{x_i = t_i \mid i = 1, \ldots, n\}$ with $n = \ell(X)$ such that $|X|_i$ is a 
fixed point for $x_i$. Transforming this specification according to (6) with 
$\phi = \mathit{flip}_c$ yields 
$\{y_i = \mathit{flip}_c(t_i(\bar{x})) \mid i = 1, \ldots, n,\ \mathit{flip}_c(x_i) = y_i\}$. 
Now $|\mathit{flip}_c(X)|_i$ is a fixed point for $y_i$: this also 
follows from the construction in the proof of Theorem 2 and the fact that 
$\mathit{flip}_c$ only changes the sign of $\pm/c$ and $\pm\backslash c$ in X. 

We now show that $\mathit{flip}_c(|X|_i)$ is a fixed point for $y_i$ by a case 
distinction on the form of $t_i$ in the equations $x_i = t_i$ ($i = 1, \ldots, n$): 


• If $x_i = x_j \unlhd c \unrhd x_k$, then $|X|_i = |X|_j \unlhd c \unrhd |X|_k$, so 
  $$\mathit{flip}_c(|X|_i) = \mathit{flip}_c(|X|_j \unlhd c \unrhd |X|_k) = \mathit{flip}_c(|X|_k) \unlhd c \unrhd \mathit{flip}_c(|X|_j).$$ 
  Note that in this case $y_i = y_k \unlhd c \unrhd y_j$. 
• If $x_i = x_j \unlhd a \unrhd x_k$ with $a \neq c$, then $|X|_i = |X|_j \unlhd a \unrhd |X|_k$, so 
  $$\mathit{flip}_c(|X|_i) = \mathit{flip}_c(|X|_j \unlhd a \unrhd |X|_k) = \mathit{flip}_c(|X|_j) \unlhd a \unrhd \mathit{flip}_c(|X|_k).$$ 
  Note that in this case $y_i = y_j \unlhd a \unrhd y_k$. 
• If $x_i = S$, then $|X|_i = S$ and $y_i = S$. Also $\mathit{flip}_c(|X|_i) = S$. 
• If $x_i = D$, then $|X|_i = D$ and $y_i = D$. Also $\mathit{flip}_c(|X|_i) = D$. 


So in all cases $\mathit{flip}_c(|X|_i)$ is a fixed point for $y_i$. Hence, 
$|\mathit{flip}_c(X)|_i = \mathit{flip}_c(|X|_i)$ and thus 
$|\mathit{flip}_c(X)|^{\rightarrow} = \mathit{flip}_c(|X|^{\rightarrow})$. 
In a similar way it follows that 
$|\mathit{swap}_{a,b}(X)|_i = \mathit{swap}_{a,b}(|X|_i)$. 
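
The commuting diagram can be exercised on concrete code. The sketch below is an illustration under our own assumptions: instructions are strings with ASCII "-" for negative tests, threads are compared through finite approximations encoded as nested tuples (P, a, Q), and all function names are hypothetical. Structural bijections preserve depth, so applying them to an approximation is the same as approximating their result.

```python
def project(instrs, j, depth):
    """Approximation to the given depth of the thread extracted at position j."""
    seen = set()
    while True:
        if depth == 0 or not 1 <= j <= len(instrs) or j in seen:
            return "D"
        ins = instrs[j - 1]
        if ins in ("!", "#"):
            return "S" if ins == "!" else "D"
        sign = ins[0] if ins[0] in "+-" else ""
        core = ins[len(sign):]
        d = 1 if core[0] == "/" else -1
        if core[1] == "#":                       # jump: no action performed
            seen.add(j)
            j += d * int(core[2:])
            continue
        t, f = j + d, j + d
        if sign == "+":
            f = j + 2 * d
        elif sign == "-":
            t = j + 2 * d
        return (project(instrs, t, depth - 1), core[1:],
                project(instrs, f, depth - 1))

def flip_code(c, instrs):
    """Exchange + and - in all test instructions containing action c."""
    other = {"+": "-", "-": "+"}
    return [other[i[0]] + i[1:] if i[0] in "+-" and i[2:] == c else i
            for i in instrs]

def swap_code(a, b, instrs):
    """Exchange a and b in all instructions containing a or b."""
    ren = {a: b, b: a}
    out = []
    for i in instrs:
        k = 2 if i[0] in "+-" else 1             # where the action name starts
        out.append(i[:k] + ren.get(i[k:], i[k:]))
    return out

def flip_thread(c, p):
    """The structural bijection flip_c on (finite) threads."""
    if p in ("S", "D"):
        return p
    l, a, r = flip_thread(c, p[0]), p[1], flip_thread(c, p[2])
    return (r, a, l) if a == c else (l, a, r)

def swap_thread(a, b, p):
    """The structural bijection swap_{a,b} on (finite) threads."""
    if p in ("S", "D"):
        return p
    ren = {a: b, b: a}
    return (swap_thread(a, b, p[0]), ren.get(p[1], p[1]), swap_thread(a, b, p[2]))

X = r"/a;+/b;\c;+/d;!;\#5".split(";")
positions = range(1, len(X) + 1)
flip_ok = all(project(flip_code("b", X), j, 6) == flip_thread("b", project(X, j, 6))
              for j in positions)
swap_ok = all(project(swap_code("a", "b", X), j, 6) == swap_thread("a", "b", project(X, j, 6))
              for j in positions)
```

Both checks succeed on expression (5) at every start position, which matches the position-wise statement at the end of the proof.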


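Note that $\mathit{swap}_{a,a}$ is the identity and so is 
$\mathit{flip}_a \circ \mathit{flip}_a$. Furthermore, for $a \neq b$ we have 
$\mathit{swap}_{a,b} = \mathit{swap}_{b,a}$ and 


$$\mathit{swap}_{a,b} \circ \mathit{flip}_c = \begin{cases}
\mathit{flip}_c \circ \mathit{swap}_{a,b} & \text{if } c \notin \{a, b\}, \\
\mathit{flip}_d \circ \mathit{swap}_{a,b} & \text{if } \{a, b\} = \{c, d\}.
\end{cases}$$ 

This implies that each TEC-automorphism can be represented as 


$$\mathit{flip}_{c_1} \circ \cdots \circ \mathit{flip}_{c_m} \circ \mathit{swap}_{a_1,b_1} \circ \cdots \circ \mathit{swap}_{a_k,b_k}.$$ 


Similarly as remarked above, for $A = \{a_1, \ldots, a_n\}$ we can do with $n - 1$ 
swaps $\mathit{swap}_{a_1,a_j}$ ($1 < j \leq n$) as these define any other swap. 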

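We further write TEC-AUT for the set of TEC-automorphisms, and 
we say that the automorphism $\mathit{swap}_{a,b}$ and the structural bijection 
$\mathit{swap}_{a,b}$ are associated, and similarly for the automorphism 
$\mathit{flip}_a$ and the structural bijection $\mathit{flip}_a$. So, the above 
result states that for the associated pair $\alpha \in$ TEC-AUT and structural 
bijection $\bar{\alpha}$ the following diagram commutes: 


$$\begin{array}{ccc}
C & \xrightarrow{\ |\_|^{\rightarrow}\ } & T_{reg} \\
\downarrow \alpha & & \downarrow \bar{\alpha} \\
C & \xrightarrow{\ |\_|^{\rightarrow}\ } & T_{reg}
\end{array}$$ 

The following corollary of Theorem 5 follows immediately. 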


Corollary 2. If $\alpha \in$ TEC-AUT, then $\alpha$ preserves the orientation of 
all instructions and $\alpha(i) = i$ for 
$i \in \{/\#k,\ \backslash\#k,\ {!},\ \# \mid k \in \mathbb{N}^+\}$. Furthermore, 
for each $a \in A$, $\alpha$ is determined by its value on one of the possible 
four test instructions. If for example $\alpha(+/a) = -/b$, then 
$\alpha(/a) = /b$, $\alpha(-/a) = +/b$, and the remaining identities are given by 
replacing all forward slashes by backward slashes. 


Each element $\alpha \in$ TEC-AUT that satisfies $\alpha^2(u) = u$ for all
C-instructions $u$ is an involution, i.e.

$$\alpha^2(X) = X.$$


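Obvious examples of involutions are $\mathit{swap}_{a,b}$ and $\mathit{flip}_a$, and a counter-example
(for $a \neq b$) is

$$\alpha = \mathit{flip}_a \circ \mathit{swap}_{a,b}$$

because

$$\alpha^2 = \mathit{flip}_a \circ \mathit{swap}_{a,b} \circ \mathit{flip}_a \circ \mathit{swap}_{a,b} = \mathit{flip}_a \circ \mathit{flip}_b \circ \mathit{swap}_{a,b} \circ \mathit{swap}_{a,b} = \mathit{flip}_a \circ \mathit{flip}_b.$$

However, $\alpha^2$ is an involution (because compositions of flip commute).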
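
The commutation law for swaps and flips and the counter-example above can be checked mechanically on single instructions. The following Python sketch is only an illustration: the string encoding of C-instructions (`'+/a'`, `'-\\a'`, `'/#2'`, ...) and the helper names `flip`, `swap`, `compose` and `instructions` are assumptions made for this example and do not occur in the paper.

```python
def flip(c):
    """flip_c on single C-instructions: change the sign of tests on action c."""
    def f(u):
        if u[0] in '+-' and u[2:] == c:
            return ('-' if u[0] == '+' else '+') + u[1:]
        return u
    return f

def swap(a, b):
    """swap_{a,b} on single C-instructions: exchange the actions a and b."""
    table = {a: b, b: a}
    def f(u):
        if u == '!' or '#' in u:          # termination, abort and jumps are fixed
            return u
        k = 2 if u[0] in '+-' else 1      # index at which the action name starts
        return u[:k] + table.get(u[k:], u[k:])
    return f

def compose(*fs):
    """compose(f, g)(u) == f(g(u)), matching the notation f o g."""
    def h(u):
        for f in reversed(fs):
            u = f(u)
        return u
    return h

def instructions(actions):
    """Basic and test instructions over the given actions, plus !, # and two jumps."""
    basic = [s + d + x for x in actions for d in '/\\' for s in ('', '+', '-')]
    return basic + ['!', '#', '/#1', '\\#1']

U = instructions(['a', 'b', 'c'])
alpha = compose(flip('a'), swap('a', 'b'))   # the counter-example
alpha2 = compose(alpha, alpha)               # equals flip_a o flip_b
```

On this encoding, `alpha2` differs from the identity (for instance on `'+/a'`), while applying `alpha2` twice returns every instruction in `U` unchanged.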
8 TEC-Anti-Automorphisms


In this section we consider the relation between structural bijections on
threads and an associated class of anti-automorphisms on $C$. Recall that a
function $\phi$ is an anti-homomorphism if it satisfies $\phi(X;Y) = \phi(Y);\phi(X)$.
Furthermore, we show how the monomorphism $h$ defined in Section 6 is
systematically related to the anti-homomorphism $g$ defined in that section.


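Define the anti-automorphism $\mathit{rev} : C \to C$ (reverse) on C-instructions
by the exchange of all forward and backward orientations:

$$\begin{array}{ll}
/a \mapsto \backslash a, & \backslash a \mapsto /a,\\
+/a \mapsto +\backslash a, & +\backslash a \mapsto +/a,\\
-/a \mapsto -\backslash a, & -\backslash a \mapsto -/a,\\
/\#k \mapsto \backslash\#k, & \backslash\#k \mapsto /\#k,\\
! \mapsto\ !, & \# \mapsto \#.
\end{array}$$

Then $\mathit{rev}^2(X) = X$, so $\mathit{rev}$ is an involution. Furthermore, it is immediately
clear that for all $X \in C$,

$$|X|^{\leftarrow} = |\mathit{rev}(X)|^{\rightarrow}.$$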
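
The identity relating leftward and rightward thread extraction can be tried out on small examples. The sketch below is a hedged illustration, not the paper's definition: it assumes that instructions are encoded as strings, that finite threads are `'S'`, `'D'` or triples `(P, a, Q)` standing for $P \unlhd a \unrhd Q$, and that thread extraction follows the usual reading of C (a test continues one or two positions further in its direction, control outside the sequence or a loop of jumps yields `D`). The function `extract` computes a finite approximation of $|X|_i$.

```python
def parse(u):
    """Split a basic, test or jump instruction into (sign, direction, rest)."""
    sign = u[0] if u[0] in '+-' else ''
    return sign, u[len(sign)], u[len(sign) + 1:]

def extract(prog, i, depth):
    """Approximation pi_depth(|X|_i) for X = prog and 1-based position i."""
    if depth == 0:
        return 'D'
    seen = set()
    while 1 <= i <= len(prog):
        u = prog[i - 1]
        if u == '!':
            return 'S'
        if u == '#':
            return 'D'
        sign, direction, rest = parse(u)
        step = 1 if direction == '/' else -1
        if rest.startswith('#'):               # jump instruction
            if i in seen:                      # loop consisting of jumps only
                return 'D'
            seen.add(i)
            i += step * int(rest[1:])
            continue
        nxt = extract(prog, i + step, depth - 1)
        if sign == '':
            return (nxt, rest, nxt)            # a o P is P <| a |> P
        skip = extract(prog, i + 2 * step, depth - 1)
        return (nxt, rest, skip) if sign == '+' else (skip, rest, nxt)
    return 'D'                                 # control left the sequence

def rev(prog):
    """Reverse the sequence and exchange forward and backward orientations."""
    flip_slash = str.maketrans('/\\', '\\/')
    return [u.translate(flip_slash) for u in reversed(prog)]

X = ['!', '\\b', '+\\a', '+/a', '\\#2', '+/a', '/#2', '/c', '#']
```

With this encoding, position `i` of `X` corresponds to position `len(X) + 1 - i` of `rev(X)`, so extraction from the right end of `X` agrees with extraction from the left end of `rev(X)`.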


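Definition 3. An anti-automorphism $\alpha$ on $C$ is thread extraction compatible
(TEC) if there exists a structural bijection $\beta$ such that the following
diagram commutes:

$$\begin{array}{ccc}
C & \xrightarrow{\;|\_|^{\rightarrow}\;} & T_{reg}\\
\downarrow{\scriptstyle\alpha} & & \downarrow{\scriptstyle\beta}\\
C & \xrightarrow{\;|\_|^{\leftarrow}\;} & T_{reg}
\end{array}$$

We write TEC-AntiAUT for the set of thread extraction compatible
anti-automorphisms on $C$. The following result establishes a strong connection
between TEC-AUT and TEC-AntiAUT.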
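Theorem 6. TEC-AntiAUT $= \{\mathit{rev} \circ \alpha \mid \alpha \in$ TEC-AUT$\}$.


Proof. Let $\gamma \in$ TEC-AntiAUT, so $\gamma$ is an anti-automorphism and there
is a structural bijection $\beta$ such that $|\gamma(X)|^{\leftarrow} = \beta(|X|^{\rightarrow})$ for all $X$. By
Theorem 5, $\beta = \overline{\alpha}$ for some $\alpha \in$ TEC-AUT and $\beta(|X|^{\rightarrow}) = |\alpha(X)|^{\rightarrow}$ for all
$X$, and thus

$$|\gamma(X)|^{\leftarrow} = |\mathit{rev} \circ \alpha(X)|^{\leftarrow} \quad \text{for all } X. \qquad (7)$$

This defines $\gamma$ on $\{/\#k, \backslash\#k, !, \# \mid k \in \mathbb{N}^+\}$. By Corollary 2, $\alpha$ is
determined by its definition on all positive forward test instructions. So, if
for $a, b \in A$, $\alpha(+/a) = +/b$ then we find by (7) with $X = +/a;!$ that
$\gamma(+/a) = +\backslash b$. Since $\alpha$ is determined for all other instructions containing
$a$, also $\gamma$ is fully determined for all instructions containing $a$. It follows that
$\gamma = \mathit{rev} \circ \alpha$, thus $\gamma \in \{\mathit{rev} \circ \alpha \mid \alpha \in$ TEC-AUT$\}$.

Conversely, if $\gamma \in \{\mathit{rev} \circ \alpha \mid \alpha \in$ TEC-AUT$\}$, say $\gamma = \mathit{rev} \circ \alpha$ with
$\alpha \in$ TEC-AUT, then $|\gamma(X)|^{\leftarrow} = |\alpha(X)|^{\rightarrow} = \beta(|X|^{\rightarrow})$ for some structural
bijection $\beta$ and all $X$. Furthermore, $\gamma$ is an anti-automorphism, so $\gamma \in$
TEC-AntiAUT.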


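Observe that for all $\alpha \in$ TEC-AUT, $\alpha \circ \mathit{rev} = \mathit{rev} \circ \alpha$ and for all
$\alpha, \beta \in$ TEC-AntiAUT, $\alpha \circ \beta \in$ TEC-AUT. Using the notation for associated
pairs we find for $\beta = \mathit{rev} \circ \alpha \in$ TEC-AntiAUT that the following diagram
commutes:

$$\begin{array}{ccc}
C & \xrightarrow{\;|\_|^{\rightarrow}\;} & T_{reg}\\
\downarrow{\scriptstyle\beta} & & \downarrow{\scriptstyle\overline{\alpha}}\\
C & \xrightarrow{\;|\_|^{\leftarrow}\;} & T_{reg}
\end{array}$$

Note that we use $\overline{\alpha}$, i.e., the associated structural bijection of $\alpha$, in this
diagram.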


Another application with $\mathit{rev}$ is the following: for $h : C \to C$ a
monomorphism, the following diagram commutes:


GS we 
Lh yer 
C ae lee 


As an example, consider the anti-homomorphism g defined in Section 6: 
indeed g = rev oh for the homomorphism h defined in that section. 
9 Expressiveness and reduced instruction sets


In this section we further consider C's instructions in the perspective of
expressiveness. We show that setting a bound on the size of jump counters
in $C$ does have consequences with respect to expressiveness: let

$$C_k$$

be defined by allowing only jump instructions with counter value $k$ or less.


We first introduce some auxiliary notions: following the definition of
residual threads in Section 2, we say that thread $Q$ is a 0-residual of thread
$P$ if $P = Q$, and an $(n+1)$-residual of $P$ if for some $a \in A$, $P = P_1 \unlhd a \unrhd P_2$
and $Q$ is an $n$-residual of $P_1$ or of $P_2$. Note that a finite thread (in BTA)
only has $n$-residuals for finitely many $n$, while for the thread $P$ defined by
$P = a \circ P$ it holds that $P$ is an $n$-residual of itself for each $n \in \mathbb{N}$.

Let $a \in A$ be fixed and $n \in \mathbb{N}^+$. Thread $P$ has the a-n-property if
$\pi_n(P) = a^n \circ D$ and $P$ has $2^n$ (different) $n$-residuals which all have a
first approximation not equal to $a \circ D$. So, if a thread $P$ has the a-n-property,
then $n$ consecutive $a$-actions can be executed and each sequence of $n$ replies
leads to a unique $n$-residual. Moreover, none of these residual threads starts
with an $a$-action (by the requirement on their first approximation). We note
that for each $n \in \mathbb{N}^+$ we can find a finite thread with the a-n-property. In
the next section we return to this point.

A piece of code $X$ has the a-n-property if for some $i$, $|X|_i$ has this
property. It is not hard to see that in this case $X$ contains at least $2^n - 1$
different $a$-tests. As an example, consider

    X = !; \b; +\a; +/a; \#2; +/a; /#2; /c; #

Clearly, $X$ has the a-2-property because $|X|_4$ has this property: its 2-residuals
are $b \circ S$, $S$, $D$ and $c \circ D$, so each thread is not equal to one of
the others and does not start with an $a$-action.

Note that if a piece of code $X$ has the a-(n + k)-property, then it also
has the a-n-property. In the example above, $X$ has the a-1-property because
$|X|_3$ has this property (and $|X|_6$ too).
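
The example can be replayed mechanically. The following Python sketch rests on assumptions that are not part of the paper: instructions are encoded as strings, finite threads are `'S'`, `'D'` or triples `(P, a, Q)` for $P \unlhd a \unrhd Q$, and `extract` implements a finite approximation of $|X|_i$ under the usual reading of C-instructions. The names `residuals`, `project` and `has_property` are hypothetical helpers for the notions defined above.

```python
def parse(u):
    """Split a basic, test or jump instruction into (sign, direction, rest)."""
    sign = u[0] if u[0] in '+-' else ''
    return sign, u[len(sign)], u[len(sign) + 1:]

def extract(prog, i, depth):
    """Approximation pi_depth(|X|_i) for X = prog and 1-based position i."""
    if depth == 0:
        return 'D'
    seen = set()
    while 1 <= i <= len(prog):
        u = prog[i - 1]
        if u == '!':
            return 'S'
        if u == '#':
            return 'D'
        sign, direction, rest = parse(u)
        step = 1 if direction == '/' else -1
        if rest.startswith('#'):               # jump instruction
            if i in seen:                      # loop consisting of jumps only
                return 'D'
            seen.add(i)
            i += step * int(rest[1:])
            continue
        nxt = extract(prog, i + step, depth - 1)
        if sign == '':
            return (nxt, rest, nxt)
        skip = extract(prog, i + 2 * step, depth - 1)
        return (nxt, rest, skip) if sign == '+' else (skip, rest, nxt)
    return 'D'                                 # control left the sequence

def residuals(P, n):
    """The set of n-residuals of a finite thread P."""
    if n == 0:
        return {P}
    if P in ('S', 'D'):
        return set()
    left, _, right = P
    return residuals(left, n - 1) | residuals(right, n - 1)

def project(P, n):
    """The n-th approximation pi_n(P) of a finite thread P."""
    if n == 0:
        return 'D'
    if P in ('S', 'D'):
        return P
    left, a, right = P
    return (project(left, n - 1), a, project(right, n - 1))

def has_property(P, a, n):
    """Check the a-n-property: pi_n(P) = a^n o D, 2^n residuals, none starting with a."""
    power = 'D'
    for _ in range(n):
        power = (power, a, power)
    res = residuals(P, n)
    return (project(P, n) == power and len(res) == 2 ** n
            and all(project(Q, 1) != ('D', a, 'D') for Q in res))

X = ['!', '\\b', '+\\a', '+/a', '\\#2', '+/a', '/#2', '/c', '#']
```

Under these assumptions, `residuals(extract(X, 4, 10), 2)` consists of the four threads listed in the text, and `has_property` holds for position 4 with `n = 2` and for positions 3 and 6 with `n = 1`.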


Lemma 1. For each $k \in \mathbb{N}$ there exists $n \in \mathbb{N}^+$ such that no $X \in C_k$ has
the a-n-property.
Proof. Suppose the contrary and let $k$ be minimal in this respect. Assume
for each $n \in \mathbb{N}^+$, $Y_n \in C_k$ has the a-n-property.
Let $B = \{\mathit{true}, \mathit{false}\}$. For $\alpha, \beta \in B^*$ we write

$$\alpha \preceq \beta$$

if $\alpha$ is a prefix of $\beta$, and we write $\alpha \prec \beta$ or $\beta \succ \alpha$ if $\alpha \preceq \beta$ and $\alpha \neq \beta$.
Furthermore, let

$$B^{\leq n} = \bigcup_{i=0}^{n} B^i,$$

thus $B^{\leq n}$ contains all $B^*$-sequences $\alpha$ with $\ell(\alpha) \leq n$ (there are $2^{n+1} - 1$
such sequences).
Let $g : \mathbb{N} \to \mathbb{N}$ be such that $|Y_n|_{g(n)}$ has the a-n-property. Define

$$f_n : B^{\leq n} \to \mathbb{N}^+$$

by $f_n(\alpha) = m$ if the instruction reached in $Y_n$ when execution started at
position $g(n)$ after the replies to $a$ according to $\alpha$ has position $m$. Clearly,
$f_n$ is an injective function.

In the following claim we show that under the supposition made in this
proof a certain form of squeezing holds: if $k'$ is sufficiently large, then for
all $n > 0$ there exist $\alpha, \beta, \gamma \in B^{k'}$ with $f_{k'+n}(\alpha) < f_{k'+n}(\beta) < f_{k'+n}(\gamma)$
with the property that $f_{k'+n}(\alpha) < f_{k'+n}(\beta') < f_{k'+n}(\gamma)$ for each extension
$\beta'$ of $\beta$ within $B^{\leq k'+n}$. This claim is proved by showing that not having
this property implies that “too many” such extensions $\beta'$ exist. Using this
claim it is not hard to contradict the minimality of $k$.


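Claim 1. Let $k'$ satisfy $2^{k'} \geq 2k + 3$. Then for all $n > 0$ there exist
$\alpha, \beta, \gamma \in B^{k'}$ with

$$f_{k'+n}(\alpha) < f_{k'+n}(\beta) < f_{k'+n}(\gamma)$$

such that for each extension $\beta' \succeq \beta$ in $B^{\leq k'+n}$,

$$f_{k'+n}(\alpha) < f_{k'+n}(\beta') < f_{k'+n}(\gamma).$$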


Proof of Claim 1. Let $k'$ satisfy $2^{k'} \geq 2k + 3$. Towards a contradiction,
suppose the stated claim is not true for some $n > 0$. The sequences in $B^{k'}$
are totally ordered by $f_{k'+n}$, say

$$f_{k'+n}(\alpha_1) < f_{k'+n}(\alpha_2) < \cdots < f_{k'+n}(\alpha_{2^{k'}}).$$

Consider the following list of sequences:

$$\alpha_1,\ \underbrace{\alpha_2, \dots, \alpha_{2k+2}}_{\text{choices for } \beta},\ \alpha_{2k+3}$$

By supposition there is for each choice $\beta \in \{\alpha_2, \dots, \alpha_{2k+2}\}$ an extension
$\beta' \succ \beta$ in $B^{\leq k'+n}$ with

$$\text{either } f_{k'+n}(\beta') < f_{k'+n}(\alpha_1), \text{ or } f_{k'+n}(\beta') > f_{k'+n}(\alpha_{2k+3}).$$

Because there are $2k + 1$ choices for $\beta$, assume that at least $k + 1$ elements
$\beta \in \{\alpha_2, \dots, \alpha_{2k+2}\}$ have an extension $\beta'$ with

$$f_{k'+n}(\beta') < f_{k'+n}(\alpha_1)$$

(the assumption $f_{k'+n}(\beta') > f_{k'+n}(\alpha_{2k+3})$ for at least $k + 1$ elements $\beta$ with
extension $\beta'$ leads to a similar argument). Then we obtain a contradiction
with respect to $f_{k'+n}$: for each of the sequences $\beta$ in the subset just selected
and its extension $\beta'$,

$$f_{k'+n}(\beta') < f_{k'+n}(\alpha_1) < f_{k'+n}(\beta),$$

and there are at least $k + 1$ different such pairs $\beta, \beta'$ (recall $f_{k'+n}$ is injective).
But this is not possible with jumps of at most $k$ because the $f_{k'+n}$ values of
each of these pairs define a path in $Y_{k'+n}$ that never has a gap that exceeds
$k$ and that passes position $f_{k'+n}(\alpha_1)$, while different paths never share a
position. This finishes the proof of Claim 1.


Take according to Claim 1 an appropriate value $k'$, some value $n > 0$
and $\alpha, \beta, \gamma \in B^{k'}$. Consider $Y_{k'+n}$ and mark the positions that are used for
the computations according to $\alpha$ and $\gamma$: these computations both start in
position $g(k' + n)$ and end in $f_{k'+n}(\alpha)$ and $f_{k'+n}(\gamma)$, respectively. Note that
the set of marked positions never has a gap that exceeds $k$.

Now consider a computation that starts from instruction $f_{k'+n}(\beta)$ in
$Y_{k'+n}$, a position in between $f_{k'+n}(\alpha)$ and $f_{k'+n}(\gamma)$. By Claim 1, the first
$n$ $a$-instructions have positions in between $f_{k'+n}(\alpha)$ and $f_{k'+n}(\gamma)$ and none
of these are marked. Leaving out all marked positions and adjusting the
associated jumps yields a piece of code, say $Y$, with smaller jumps, thus
in $C_{k-1}$, that has the a-n-property. Because $n$ was chosen arbitrarily, this
contradicts the initial supposition that $k$ was minimal.
Theorem 7. For any $k \in \mathbb{N}^+$, not all threads in BTA can be expressed in
$C_k$. This is also the case if thread extraction may start at arbitrary positions.


Proof. Fix some value $k$. Then, by Lemma 1 we can find a value $n$ such
that no $X \in C_k$ has the a-n-property. But we can define a finite thread
that has this property.


In the next section we discuss a systematic approach to define finite 
threads that have the a-n-property. 


10 Boolean Registers for Producing Threads 


In this section we briefly discuss the use of Boolean registers to ease pro- 
gramming in C. This is an example of so-called thread-service composition. 
In appendix C we provide a brief but general introduction to thread-service 
composition. 


Consider Boolean registers named b1, b2, ..., bn which all are initially
set to $F$ (false) and can be set to $T$ (true). We write $bi(b)$ with $b \in \{T, F\}$
to indicate that bi's value is $b$. The action bi.set:b sets register bi to $b$ and
yields true as its reply. The action bi.get reads the value from register bi
and provides this value as its reply. The defining rules for threads in BTA
that use one of these registers are for $b, b' \in \{T, F\}$, $i \in \{1, \dots, n\}$:

$$\begin{array}{rcl}
S /_{bi}\, bi(b) &=& S,\\
D /_{bi}\, bi(b) &=& D,\\
(P \unlhd \mathtt{bi.set{:}}b' \unrhd Q) /_{bi}\, bi(b) &=& P /_{bi}\, bi(b'),\\
(P \unlhd \mathtt{bi.get} \unrhd Q) /_{bi}\, bi(b) &=& \begin{cases} P /_{bi}\, bi(b) & \text{if } b = T,\\ Q /_{bi}\, bi(b) & \text{if } b = F, \end{cases}
\end{array}$$

and, if none of these rules apply,

$$(P \unlhd a \unrhd Q) /_{bi}\, bi(b) = (P /_{bi}\, bi(b)) \unlhd a \unrhd (Q /_{bi}\, bi(b)).$$

The operator $/_{bi}$ is called the use operator and stems from [8]. Observe
that the requests to the service bi do not occur as actions in the behavior
of a thread-service composition. So the composition hides the associated
actions.
As a simple example consider the C-program $X$ that has extra instructions
based on the set $\{\mathtt{bi.set{:}}b, \mathtt{bi.get} \mid b \in \{T, F\}, i \in \{1, 2\}\}$:

    X = +/a; /b1.set:T;
        +/a; /b2.set:T;
        +/b1.get; /c; /d;
        +/b2.get; /c; /d; !

Then one can derive (recall the initial value of b1 and b2 is $F$):

$$\begin{array}{rcl}
(|X|^{\rightarrow} /_{b1}\, b1) /_{b2}\, b2 &=& ((|X|_3 /_{b1}\, b1(T)) \unlhd a \unrhd (|X|_3 /_{b1}\, b1(F))) /_{b2}\, b2\\
&=& (R_1 \unlhd a \unrhd R_2) \unlhd a \unrhd (R_3 \unlhd a \unrhd R_4)
\end{array}$$

where $R_1 = c \circ d \circ c \circ d \circ S$ (case T,T), $R_2 = c \circ d \circ d \circ S$ (case T,F),
$R_3 = d \circ c \circ d \circ S$ (case F,T), and $R_4 = d \circ d \circ S$ (case F,F). So, the
four possible combinations of the values of b1 and b2 yield the different
2-residuals $R_1, \dots, R_4$. Clearly, $X$ has the a-2-property. The particular form
of the C-program $X$ already suggests how to generalize $X$ to a family of
C-programs $Z_n$ ($n \in \mathbb{N}^+$) such that

$$((|Z_n|^{\rightarrow} /_{b1}\, b1) \dots) /_{bn}\, bn$$

has the a-n-property:

    Z_n = +/a; /b1.set:T;
          +/a; /b2.set:T;
          ...
          +/a; /bn.set:T;
          +/b1.get; /c; /d;
          +/b2.get; /c; /d;
          ...
          +/bn.get; /c; /d; !


Each series of $n$ replies to the positive test instructions +/a has a unique
continuation after which $Z_n$ terminates successfully: the number of true-replies
matches the number of $c$-actions, and their ordering that of the
occurring $d$-actions. Obviously, each thread $((|Z_n|^{\rightarrow} /_{b1}\, b1) \dots) /_{bn}\, bn$ is a
finite thread in BTA and can thus be produced by a C-program not using
Boolean registers (cf. Theorem 3).
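
The family $Z_n$ and the use operator for Boolean registers can be explored with a small program. The sketch below is an illustration under stated assumptions: instructions are strings, finite threads are `'S'`, `'D'` or triples `(P, a, Q)` for $P \unlhd a \unrhd Q$, `extract` is an assumed finite approximation of thread extraction for C, and `use` transcribes the defining rules of the use operator given above for finite threads. All function names are hypothetical.

```python
def parse(u):
    """Split a basic, test or jump instruction into (sign, direction, rest)."""
    sign = u[0] if u[0] in '+-' else ''
    return sign, u[len(sign)], u[len(sign) + 1:]

def extract(prog, i, depth):
    """Approximation pi_depth(|X|_i) for X = prog and 1-based position i."""
    if depth == 0:
        return 'D'
    seen = set()
    while 1 <= i <= len(prog):
        u = prog[i - 1]
        if u == '!':
            return 'S'
        if u == '#':
            return 'D'
        sign, direction, rest = parse(u)
        step = 1 if direction == '/' else -1
        if rest.startswith('#'):               # jump instruction
            if i in seen:
                return 'D'
            seen.add(i)
            i += step * int(rest[1:])
            continue
        nxt = extract(prog, i + step, depth - 1)
        if sign == '':
            return (nxt, rest, nxt)
        skip = extract(prog, i + 2 * step, depth - 1)
        return (nxt, rest, skip) if sign == '+' else (skip, rest, nxt)
    return 'D'

def use(P, reg, value=False):
    """P /_reg reg(value) for a finite thread P; registers start at F (False)."""
    if P in ('S', 'D'):
        return P
    left, act, right = P
    if act.startswith(reg + '.set:'):          # reply true, new register value
        return use(left, reg, act.endswith(':T'))
    if act == reg + '.get':                    # reply is the current value
        return use(left if value else right, reg, value)
    return (use(left, reg, value), act, use(right, reg, value))

def Z(n):
    """The C-program Z_n as a list of instruction strings."""
    prog = []
    for i in range(1, n + 1):
        prog += ['+/a', f'/b{i}.set:T']
    for i in range(1, n + 1):
        prog += [f'+/b{i}.get', '/c', '/d']
    return prog + ['!']

def registers_thread(n):
    """((|Z_n|-> /_b1 b1) ...) /_bn bn as a finite thread."""
    P = extract(Z(n), 1, 5 * n + 2)            # deep enough to reach termination
    for i in range(1, n + 1):
        P = use(P, f'b{i}')
    return P

def seq(actions):
    """The thread a1 o a2 o ... o S for a string of one-letter actions."""
    t = 'S'
    for a in reversed(actions):
        t = (t, a, t)
    return t

def residuals(P, n):
    """The set of n-residuals of a finite thread P."""
    if n == 0:
        return {P}
    if P in ('S', 'D'):
        return set()
    return residuals(P[0], n - 1) | residuals(P[2], n - 1)
```

For `n = 2` the result coincides with $(R_1 \unlhd a \unrhd R_2) \unlhd a \unrhd (R_3 \unlhd a \unrhd R_4)$ from the example, and for `n = 3` the composed thread has eight different 3-residuals.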

More information about thread-service composition is given in Ap- 
pendix C. 
11 On the Length of C-Programs for Producing Threads


C-programs can be viewed as descriptions of finite state threads. In this
section we consider the question of which program length is needed to produce
a finite state thread. We also consider the case that auxiliary Boolean
registers are used for producing threads, which can be a very convenient
feature as was shown in the previous section. We find upper and lower
bounds for the lengths of C-programs.


For $k, n \in \mathbb{N}^+$ let

$$\psi(k,n) \in \mathbb{N}^+$$

be the minimal value such that each thread over alphabet $a_1, \dots, a_k$ with
at most $n$ states can be expressed as a C-program with at most $\psi(k,n)$
instructions. Furthermore, let

$$\psi_{br}(k,n) \in \mathbb{N}^+$$

be the minimal value such that each thread over alphabet $a_1, \dots, a_k$ with
at most $n$ states can be expressed as a C-program with at most $\psi_{br}(k,n)$
instructions including those to use Boolean registers.

It is not hard to see that

$$\psi(k,n) \leq 3n \quad \text{and} \quad \psi_{br}(k,n) \leq 3n$$

because each state can be described by either the piece of code

    +/a_i; u; v

with $u$ and $v$ jumps to the pieces of code that model the two successor
states, or by ! or #. Presumably, a sharper upper bound for both $\psi(k,n)$
and $\psi_{br}(k,n)$ can be found.
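
The construction behind the $3n$ upper bound can be sketched as a small translation. The following Python fragment is an illustration only: the dictionary representation of a finite-state thread and the function name `compile_thread` are assumptions made here. Each state is either `'S'`, `'D'` or a triple `(true_state, action, false_state)`, and is translated into `!`, `#` or a block `+/a; u; v` whose two jumps lead to the blocks of the successor states.

```python
def compile_thread(spec, order):
    """Translate a finite-state thread into a C-program of at most 3n instructions.

    spec maps each state to 'S', 'D' or (true_state, action, false_state);
    order lists all states, the initial state first.
    """
    pos, p = {}, 1
    for s in order:                        # first pass: start position of each block
        pos[s] = p
        p += 1 if spec[s] in ('S', 'D') else 3

    def jump(src, dst):
        """Jump instruction located at position src that leads to position dst."""
        k = dst - src                      # never 0: block starts are not jump positions
        return f'/#{k}' if k > 0 else f'\\#{-k}'

    prog = []
    for s in order:                        # second pass: emit the blocks
        t = spec[s]
        if t == 'S':
            prog.append('!')
        elif t == 'D':
            prog.append('#')
        else:
            on_true, action, on_false = t
            prog += ['+/' + action,
                     jump(pos[s] + 1, pos[on_true]),
                     jump(pos[s] + 2, pos[on_false])]
    return prog
```

For example, the two-state thread given by $P_0 = P_1 \unlhd a \unrhd P_0$ and $P_1 = S$ is translated into `+/a; /#2; \#2; !`, which has 4 instructions and stays below the bound $3n = 6$.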

As for a lower bound for $\psi_{br}(k,n)$, we can use auxiliary Boolean registers
by forward basic instructions

    /bi.set:T
    /bi.set:F
    /bi.get

and their backward and test counterparts. So, each Boolean register bi
comes with 18 different instructions, and of course at most $\psi_{br}(k,n)$ of
these can be used.

Programs containing at most $l = \psi_{br}(k,n)$ instructions contain per
position $i$ at most $l - 1$ jump instructions, namely jumps to all other (at
most $l - 1$) positions in the program.

So, if we restrict to $k = 1$, say /a is the only forward basic instruction
involved (with backward and test variants yielding 5 more instructions) and
include the termination instruction ! and the abort instruction #, the
admissible instruction alphabet counts

$$2 + 6 + (l - 1) + 18l$$

instructions. Because $l \geq 1$, this is bounded by $26l$ instructions, and therefore
we count

$$(26l)^l$$

syntactically different programs.
A lower bound on the number of threads with $n$ states over one action
$a$ can be estimated as follows: let $F$ range over all functions

$$\{1, \dots, n-1\} \to \{0, 1, \dots, n-1\},$$

thus there are $n^{n-1}$ different $F$. Define threads $P^F_k$ for $k = 0, \dots, n-1$ by


We claim that for a fixed $n$ the threads $P^F_{n-1}$ (each one containing $n$ states
$P^F_0, \dots, P^F_{n-1}$) are for each $F$ different, thus yielding $n^{n-1}$ different threads,
so we find

$$(26l)^l \geq n^{n-1}. \qquad (8)$$

Assume $n \geq 2$, thus $26 < 25n$, thus $n < 26n - 26$, thus $\frac{n}{26} < n - 1$. Suppose
$l < \frac{n}{26}$, then $26l < n$ and $l < n - 1$, which contradicts (8). Thus

$$l \geq \frac{n}{26}.$$
So, for $k = 1$ and in fact for arbitrary $k \geq 1$ we find

$$\frac{n}{26} \leq \psi_{br}(k,n) \leq 3n.$$

In the case that we do not allow the use of auxiliary Boolean registers,
it follows in the same manner as above that for arbitrary $k \geq 1$,


. < (k,n) < 3n. 
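
The counting argument behind the lower bound with Boolean registers can be checked numerically for small $n$. The following Python sketch is only an illustration of inequality (8); the function names are hypothetical, and the computed value is merely the smallest length that (8) does not exclude, not the actual minimal program length.

```python
def programs_bound(l):
    """Upper bound (26 l)^l on the number of programs with l instructions."""
    return (26 * l) ** l

def threads_bound(n):
    """Lower bound n^(n-1) on the number of n-state threads over one action."""
    return n ** (n - 1)

def least_admissible_length(n):
    """Smallest l >= 1 with (26 l)^l >= n^(n-1), i.e. compatible with (8)."""
    l = 1
    while programs_bound(l) < threads_bound(n):
        l += 1
    return l
```

For every $n$ tried, `least_admissible_length(n)` is at least $n/26$, in line with the derivation above.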


We see it as a challenging problem to improve the bounds of $\psi_{br}(k,n)$
and $\psi(k,n)$.


12 Discussion 


In this paper we proposed an algebra of instruction sequences based on a set 
of instructions without directional bias. The use of the phrase “instruction 
sequence” asks for some rigorous motivation. This is a subtle matter which 
defeats many common sense intuitions regarding the science of computer 
programming. 

The Latin source of the word ‘instruction’ tells us no more than that 
the instruction is part of a listing. On that basis, instruction sequence is a 
pleonasm and justification is problematic.⁶ We need to add the additional
connotation of instruction as a “unit of command”. This puts instructions 
at a core position. Maurer’s paper A theory of computer instructions [12] 
provides a theory of instructions which can be taken on board in an attempt 
to define what is an instruction in this more narrow sense. Now Maurer’s 
instructions certainly qualify as such but his survey is not exhaustive. His 
theory has an intentional focus on transformation of data while leaving 
change of control unexplained. We hold that Maurer’s theory, including 
his ongoing work on this theme in [13], provides a candidate definition for 
so-called basic instructions. 

At this stage different arguments can be used to make progress. Suppose
a collection $\mathcal{I}$ is claimed to constitute a set of instructions:


⁶[10]: INSTRUCTION, in Latin instructio, comes from in and struo to dispose or
regulate, signifying the thing laid down.

The following is taken from http://www.etymonline.com/. INSTRUCTION: from
O.Fr. instruction, from L. instructionem (nom. instructio) “building, arrangement, teaching,”
from instructus, pp. of instruere “arrange, inform, teach,” from in- “on” + struere
“to pile, build” (see structure).
1. If the mnemonics of elements of $\mathcal{I}$ are reminiscent of known instructions
   of some low level program notations, and if the semantics provided
   complies with that view, the use of these terms may be considered
   justified.


2. If, however, unknown, uncommon or even novel instructions are included
   in $\mathcal{I}$, the argument of 1 cannot be used. Of course some
   similarity of explanation can be used to carry the jargon beyond conventional
   use. At some stage, however, a more intrinsic justification
   may be needed.


3. A different perspective emerges if one asserts that certain instruction
   sequences constitute programs, thus considering $\mathcal{I}^+$ (i.e., finite,
   non-empty sequences of instructions from $\mathcal{I}$) one may determine a subset
   $P \subseteq \mathcal{I}^+$ of programs. Now a sequence in $\mathcal{I}^+$ qualifies as a program if
   and only if it is in $P$. In the context of C-expressions we say that

       +/a; \#10; /b; +/c; /#8; !; !

   is not in $P$ because the jumps outside the range of instructions cannot
   be given a natural and preferred semantics, as opposed to +/a; \#1; !
   and +/a; /b; +/c; !; !. We here state once more that we do not
   consider the empty sequence of instructions as a program, or even as
   an instruction sequence because we have no canonical meaning or even
   intuition about such an empty sequence in this context.


4. The next question is how to determine $P$. At this point we make use
   of the framework of PGA [7, 14] (for a brief explanation of PGA see
   Appendix A). A program is a piece of data for which the preferred and
   natural meaning is a “sequence of primitive instructions”, abbreviated
   to a SPI. Primitive instructions are defined over some collection $A$
   of basic instructions. The meaning of a program $X$ is by definition
   provided by means of a projection function which produces a SPI for
   $X$. Using PGA as a notation for SPIs, the projection function can be
   written p2pga (“P to PGA”). The behavior $|X|_P$ for $X \in P$ is given
   by

   $$|X|_P = |\mathtt{p2pga}(X)|$$

   where thread extraction in PGA, i.e., $|\ldots|$, is supposed to be known.
5. In the particular case of $\mathcal{I}$ consisting of C's instructions, we take
   for $P$ those instruction sequences for which control never reaches
   outside the sequence. These are the sequences that we called C-programs.
   First we restrict to C-programs composed from instructions
   in $\{/a, +/a, -/a, /\#k, \backslash\#k, !, \# \mid a \in A, k \in \mathbb{N}^+\}$ and we define

   $$F(i_1; \dots; i_n) = (\psi(i_1); \dots; \psi(i_n))^{\omega}$$

   as a “pre-projection function” that uses an auxiliary function $\psi$ on
   these instructions:

   $$\begin{array}{rcl}
   \psi(/a) &=& a,\\
   \psi(+/a) &=& +a,\\
   \psi(-/a) &=& -a,\\
   \psi(/\#k) &=& \#k,\\
   \psi(\backslash\#k) &=& \#n{-}k,\\
   \psi(!) &=& !,\\
   \psi(\#) &=& \#0.
   \end{array}$$

   We can rewrite each C-program into this restricted form by applying
   the behavior preserving homomorphism $h$ defined in Section 6. Thus
   our final definition of a projection can be $\mathtt{p2pga} = F \circ h$. Note that
   many alternatives for $h$ could have been used as well (as was already
   noted in Section 6).


6. Conversely, each PGA-program can be embedded into $C$ while its
   behavior is preserved. For repetition free programs this embedding is
   defined by the addition of forward slashes and replacing #0 by #.⁷
   In the other case, a PGA-program can be embedded into PGLB, a
   variant of PGA with backward jumps and no repetition operator [7],
   and transformation from PGLB to $C$ is trivial.
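
The pre-projection of item 5 is easy to state as a program. The following Python sketch is a hedged illustration: the string encoding of instructions and the function names `psi` and `pre_projection` are assumptions made here, and the returned list is to be read as the body of a repetition, i.e. as $(\psi(i_1); \dots; \psi(i_n))^{\omega}$.

```python
def psi(u, n):
    """Auxiliary function psi for a restricted C-instruction u in a program of length n."""
    if u == '!':
        return '!'
    if u == '#':
        return '#0'
    if u.startswith('/#'):                     # forward jump keeps its counter
        return '#' + u[2:]
    if u.startswith('\\#'):                    # backward jump wraps around: #(n - k)
        return '#%d' % (n - int(u[2:]))
    if u[0] in '+-' and u[1] == '/':           # forward test instruction
        return u[0] + u[2:]
    if u[0] == '/':                            # forward basic instruction
        return u[1:]
    raise ValueError('instruction outside the restricted set: ' + u)

def pre_projection(prog):
    """Body of the PGA repetition that F assigns to a restricted C-program."""
    n = len(prog)
    return [psi(u, n) for u in prog]
```

For the program +/a; \#1; ! from item 3 this yields the body `+a; #2; !`, so the backward jump becomes a forward jump of length 2 that wraps around to the first instruction of the next copy. Programs containing backward basic or test instructions are rejected here, since they would first have to be rewritten with $h$.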


In the case of $C$, items 4 and 5 above should of course be proved, i.e., for a
C-program $X$,

$$|X|^{\rightarrow} = |X|_C \quad (= |\mathtt{p2pga}(X)|),$$

and for item 6 a similar requirement about the definition of $|\ldots|^{\rightarrow}$ should
be substantiated. We omit these proofs as they seem rather clear.


⁷The instruction # already occurred in [6], but was in [7] replaced by #0, thus
admitting a more systematic treatment of “jumps”.
A PGA, a summary


Let a set $A$ of constants with typical elements $a, b, c, \dots$ be given. PGA-programs
are of the following form ($a \in A$, $k \in \mathbb{N}$):

$$P ::= a \mid +a \mid -a \mid \#k \mid\ ! \mid P;P \mid P^{\omega}.$$

Each of the first five forms above is called a primitive instruction. We write $U$
for the set of primitive instructions and we define each element of $U$ to be a SPI
(Sequence of Primitive Instructions).

Finite SPIs are defined using concatenation: if $P$ and $Q$ are SPIs, then so is

$$P;Q$$

which is the SPI that lists $Q$'s primitive instructions right after those of $P$, and we
take concatenation to be an associative operator.
Periodic SPIs are defined using the repetition operator: if $P$ is a SPI, then

$$P^{\omega}$$

is the SPI that repeats $P$ forever, thus $P;P;P;\dots$. Typical identities that relate
repetition and concatenation of SPIs are

$$(P;P)^{\omega} = P^{\omega} \quad \text{and} \quad (P;Q)^{\omega} = P;(Q;P)^{\omega}.$$

Another typical identity is

$$P^{\omega};Q = P^{\omega},$$

expressing that nothing “can follow” an infinite repetition.

The execution of a SPI is single-pass: it starts with the first (left-most) in- 
struction, and each instruction is dropped after it has been executed or jumped 
over. 

Equations for thread extraction on SPIs, notation $|X|$, are the following, where
$a$ ranges over $A$, $u$ over the primitive instructions $U$, and $k \in \mathbb{N}$:

$$\begin{array}{ll}
|!| = S & |!;X| = S\\
|a| = a \circ D & |a;X| = a \circ |X|\\
|{+a}| = a \circ D & |{+a};X| = |X| \unlhd a \unrhd |\#2;X|\\
|{-a}| = a \circ D & |{-a};X| = |\#2;X| \unlhd a \unrhd |X|\\
|\#k| = D & |\#0;X| = D\\
 & |\#1;X| = |X|\\
 & |\#k{+}2;u| = D\\
 & |\#k{+}2;u;X| = |\#k{+}1;X|
\end{array}$$


For more information on PGA we refer to [7, 14]. 
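
For finite SPIs the equations above can be read as a recursive procedure. The sketch below transcribes them into Python under stated assumptions: a SPI is a non-empty list of strings such as `'+a'`, `'#2'` or `'!'`, finite threads are `'S'`, `'D'` or triples `(P, a, Q)` for $P \unlhd a \unrhd Q$, and repetition is not covered. The name `pga_thread` is hypothetical.

```python
def pga_thread(spi):
    """Thread |X| of a finite, non-empty SPI X given as a list of primitive instructions."""
    u, rest = spi[0], spi[1:]
    if u == '!':
        return 'S'                                   # |!| = |!;X| = S
    if u.startswith('#'):
        k = int(u[1:])
        if k == 0 or not rest:
            return 'D'                               # |#0;X| = D and |#k| = D
        if k == 1:
            return pga_thread(rest)                  # |#1;X| = |X|
        if len(rest) == 1:
            return 'D'                               # |#k+2;u| = D
        return pga_thread(['#%d' % (k - 1)] + rest[1:])   # |#k+2;u;X| = |#k+1;X|
    sign = u[0] if u[0] in '+-' else ''
    a = u[len(sign):]
    if not rest:
        return ('D', a, 'D')                         # |a| = |+a| = |-a| = a o D
    nxt = pga_thread(rest)
    if sign == '':
        return (nxt, a, nxt)                         # |a;X| = a o |X|
    skip = pga_thread(['#2'] + rest)
    return (nxt, a, skip) if sign == '+' else (skip, a, nxt)
```

As a usage example, `pga_thread(['+a', '!', 'b', '!'])` yields the thread $S \unlhd a \unrhd (b \circ S)$: after a positive reply the program terminates, and after a negative reply the termination instruction is skipped and $b$ is executed.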


B Basic Thread Algebra and Finite Approximations

An elegant result based on [2] is that equality of recursively specified regular threads
can be easily decided. Because one can always take the disjoint union of two
finite linear recursive specifications, it suffices to consider a single specification
$\{P_i = t_i \mid 1 \leq i \leq n\}$. Then $P_i = P_j$ follows from

$$\pi_{n-1}(P_i) = \pi_{n-1}(P_j).$$

Thus, it is sufficient to decide whether two certain finite threads are equal. We
provide a proof sketch:

For $k \geq 0$ consider the equivalence relation $\cong_k$ on $\{P_1, \dots, P_n\}$ defined by
$P_i \cong_k P_j$ if $\pi_k(P_i) = \pi_k(P_j)$. Then

$$\cong_0\ \supseteq\ \cong_1\ \supseteq\ \cong_2\ \supseteq \dots \qquad (9)$$

If $\cong_k = \cong_{k+1}$ then $\cong_{k+1} = \cong_{k+2}$. This follows from (9) and $\cong_{k+1} \subseteq \cong_{k+2}$.
Suppose the latter is not true, then $\pi_{k+1}(P_i) = \pi_{k+1}(P_j)$ while $\pi_{k+2}(P_i) \neq \pi_{k+2}(P_j)$.
The only possible cases are that $P_i = P_m \unlhd a \unrhd P_l$ and $P_j = P_{m'} \unlhd a \unrhd P_{l'}$ and
$\pi_{k+1}(P_m) \neq \pi_{k+1}(P_{m'})$ or $\pi_{k+1}(P_l) \neq \pi_{k+1}(P_{l'})$. So by $\cong_k = \cong_{k+1}$, at least
one of $\pi_k(P_m) \neq \pi_k(P_{m'})$ and $\pi_k(P_l) \neq \pi_k(P_{l'})$ must be true, but this refutes
$\pi_{k+1}(P_i) = \pi_{k+1}(P_j)$. So, once the sequence (9) becomes constant, it remains constant.
Since this sequence is decreasing and the maximum number of equivalence
classes on $\{P_1, \dots, P_n\}$ is $n$, at most the first $n$ relations in the sequence can be unequal,
hence $\cong_{n-1} = \cong_n$ and thus $\pi_{n-1}(P_i) = \pi_{n-1}(P_j)$ implies $\pi_k(P_i) = \pi_k(P_j)$
for all $k \in \mathbb{N}$.

It is not difficult to show for threads $P$ and $Q$: if $\pi_k(P) = \pi_k(Q)$ for all $k \in \mathbb{N}$
then $P = Q$. First, each (infinite) thread is a projective sequence on which $\pi_k$ is
defined componentwise. Secondly, for a projective sequence $(P_n)_{n \in \mathbb{N}}$ it follows that
$\pi_k(P_k) = \pi_k(\pi_k(P_{k+1})) = \pi_k(P_{k+1}) = P_k$ for all $k \in \mathbb{N}$. So, for $(Q_n)_{n \in \mathbb{N}}$ a projective
sequence, $P_k = \pi_k(P_k) = \pi_k(Q_k) = Q_k$ for all $k$ implies $(P_n)_{n \in \mathbb{N}} = (Q_n)_{n \in \mathbb{N}}$.
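
The decision procedure sketched in this appendix can be written down directly. The following Python fragment is an illustration under assumptions: a finite linear recursive specification is a dictionary mapping each name to `'S'`, `'D'` or a triple `(left, action, right)` of names, and the approximations satisfy $\pi_0(P) = D$ and $\pi_{k+1}(P \unlhd a \unrhd Q) = \pi_k(P) \unlhd a \unrhd \pi_k(Q)$. Instead of building the finite threads $\pi_{n-1}(P_i)$ explicitly, the hypothetical function `equal_threads` compares them by a memoized recursion.

```python
from functools import lru_cache

def equal_threads(spec, p, q):
    """Decide P_p = P_q for a linear specification with n equations via pi_(n-1)."""
    n = len(spec)

    @lru_cache(maxsize=None)
    def same(k, x, y):
        """True iff pi_k(P_x) equals pi_k(P_y)."""
        if k == 0:
            return True                      # pi_0 of any thread is D
        tx, ty = spec[x], spec[y]
        if isinstance(tx, str) or isinstance(ty, str):
            return tx == ty                  # S or D is involved
        return (tx[1] == ty[1]
                and same(k - 1, tx[0], ty[0])
                and same(k - 1, tx[2], ty[2]))

    return same(n - 1, p, q)
```

As a usage example, in the specification $P_1 = a \circ P_2$, $P_2 = a \circ P_1$, $P_3 = a \circ P_3$ the three threads all denote the infinite repetition of $a$, and `equal_threads` reports them as equal after comparing approximations of depth $n - 1$ only.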


C Thread-Service Composition 


Most of this text is taken from [14]. A service, or a state machine, is a pair $(\Sigma, F)$
consisting of a set $\Sigma$ of so-called co-actions and a reply function $F$. The reply
function is a mapping that gives for each non-empty finite sequence of co-actions
from $\Sigma$ a reply true or false.


Example 2. A stack can be defined as a service with co-actions push:i, topeq:i,
and pop, for $i = 1, \dots, n$ for some $n$, where push:i pushes $i$ onto the stack and
yields true, the action topeq:i tests whether $i$ is on top of the stack, and pop pops
the stack with reply true if it is non-empty, and it yields false otherwise.


Services model (part of) the execution environment of threads. In order to
define the interaction between a thread and a service, we let actions be of the form
c.m where c is the so-called channel or focus, and m is the co-action or method.
For example, we write s.pop to denote the action which pops a stack via channel
s. For service $H = (\Sigma, F)$ and thread $P$, $P /_c H$ represents $P$ using the service $H$
via channel $c$. The defining rules for threads in BTA are:

$$\begin{array}{ll}
S /_c H = S, & \\
D /_c H = D, & \\
(P \unlhd c'.m \unrhd Q) /_c H = (P /_c H) \unlhd c'.m \unrhd (Q /_c H) & \text{if } c' \neq c,\\
(P \unlhd c.m \unrhd Q) /_c H = P /_c H' & \text{if } m \in \Sigma \text{ and } F(m) = \mathit{true},\\
(P \unlhd c.m \unrhd Q) /_c H = Q /_c H' & \text{if } m \in \Sigma \text{ and } F(m) = \mathit{false},\\
(P \unlhd c.m \unrhd Q) /_c H = D & \text{if } m \notin \Sigma,
\end{array}$$

where $H' = (\Sigma, F')$ with $F'(\sigma) = F(m\sigma)$ for all co-action sequences $\sigma \in \Sigma^+$.
The operator $/_c$ is called the use operator and stems from [8]. An expression
$P /_c H$ is sometimes referred to as a thread-service composition. The use operator
is expanded to infinite threads in BTA$^{\infty}$ by defining

$$(P_n)_{n \in \mathbb{N}} /_c H = \bigsqcup_{n \in \mathbb{N}} P_n /_c H.$$



(Cf. [4].) It follows that the rules for finite threads are valid for infinite threads 
as well. Observe that the requests to the service do not occur as actions in the 
behavior of a thread-service composition. So the composition not only reduces 
the above-mentioned non-determinism of the thread, but also hides the associated 
actions. 

In the next example we show that the use of services may turn regular threads 
into non-regular ones. 


Example 3. We define a thread using a stack as defined in Example 2. We only
push the value 1 (so the stack behaves as a counter), and write $S(n)$ for a stack
holding $n$ times the value 1. By the defining equations for the use operator it follows
that for any thread $P$,

$$\begin{array}{rcl}
(\mathtt{s.push{:}1} \circ P) /_s S(n) &=& P /_s S(n+1),\\
(P \unlhd \mathtt{s.pop} \unrhd S) /_s S(0) &=& S,\\
(P \unlhd \mathtt{s.pop} \unrhd S) /_s S(n+1) &=& P /_s S(n).
\end{array}$$

Now consider the regular thread $Q$ defined by

$$Q = (\mathtt{s.push{:}1} \circ Q) \unlhd a \unrhd R, \qquad R = (b \circ R) \unlhd \mathtt{s.pop} \unrhd S,$$

where actions $a$ and $b$ do not use focus $s$. Then, for all $n \in \mathbb{N}$,

$$\begin{array}{rcl}
Q /_s S(n) &=& ((\mathtt{s.push{:}1} \circ Q) \unlhd a \unrhd R) /_s S(n)\\
&=& (Q /_s S(n+1)) \unlhd a \unrhd (R /_s S(n)).
\end{array}$$

It is not hard to see that $Q /_s S(0)$ is an infinite thread with the property that for
all $n$, a trace of $n+1$ $a$-actions produced by $n$ positive and one negative reply on $a$
is followed by $b^n \circ S$. This yields a non-regular thread: if $Q /_s S(0)$ were regular,
it would be a fixed point of some finite linear recursive specification, say with $k$
equations. But specifying a trace $b^k \circ S$ already requires $k + 1$ linear equations
$x_1 = b \circ x_2, \dots, x_k = b \circ x_{k+1}, x_{k+1} = S$, which contradicts the assumption. So
$Q /_s S(0)$ is not regular.
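
The behavior described in Example 3 can be observed with a small simulation. The sketch below is an illustration under assumptions: the two equations for $Q$ and $R$ are unfolded into a table `SPEC` with auxiliary states `'Q1'` and `'R1'`, the stack $S(n)$ is represented by a counter, and the hypothetical function `run` follows one path of $Q /_s S(0)$ for a given list of replies to the visible actions.

```python
SPEC = {
    'Q':  ('Q1', 'a', 'R'),         # Q = (s.push:1 o Q) <| a |> R
    'Q1': ('Q', 's.push:1', 'Q'),   # s.push:1 o Q
    'R':  ('R1', 's.pop', 'S'),     # R = (b o R) <| s.pop |> S
    'R1': ('R', 'b', 'R'),          # b o R
    'S':  'S',
}

def run(replies, state='Q', count=0, max_steps=10000):
    """Visible trace of  state /_s S(count)  for the given replies to visible actions.

    Requests to the stack are answered by the counter and hidden from the trace;
    once the list of replies is exhausted, further visible actions get reply True.
    """
    trace = []
    replies = iter(replies)
    for _ in range(max_steps):
        t = SPEC[state]
        if t == 'S':
            return trace + ['S']
        left, act, right = t
        if act == 's.push:1':
            count += 1                       # push always replies true
            state = left
        elif act == 's.pop':
            if count > 0:                    # non-empty stack: reply true
                count -= 1
                state = left
            else:                            # empty stack: reply false
                state = right
        else:
            trace.append(act)
            state = left if next(replies, True) else right
    raise RuntimeError('no termination within max_steps')
```

With `n` positive replies followed by one negative reply, the trace consists of `n + 1` occurrences of `a`, then `n` occurrences of `b`, and finally termination, which is the pattern used in the non-regularity argument.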


Finally, we note that the use of finite state services, such as Boolean registers, 
can not turn regular threads into non-regular ones (see [8]). More information on 
thread-service composition can be found in e.g. [14]. 